API Reference
- atlas_profiler.process_dataset(data, geo_classifier=True, geo_classifier_threshold=0.5, include_sample=False, coverage=True, plots=False, indexes=True, load_max_size=None, metadata=None, nominatim=None, datamart_geo_data=None, **kwargs)
Compute all metafeatures from a dataset.
- Parameters:
data – path to dataset, or file object, or DataFrame
geo_classifier – True to enable the geo_classifier
geo_classifier_threshold – Confidence threshold for geo_classifier predictions (default: 0.5). Predictions below this threshold are treated as “non_spatial”.
include_sample – Set to True to include a few random rows in the result. Useful for presenting to a user.
coverage – Whether to compute data ranges
plots – Whether to compute plots
indexes – Whether to include indexes. If True (the default) and the input is a DataFrame whose index(es) differ from the default range, they will appear in the result alongside the columns.
load_max_size (bytes) – Target size of the data to be analyzed. The data will be randomly sampled if it is bigger. Defaults to MAX_SIZE, currently 5 MB (5000000). This is different from the sample data included in the result.
metadata – The metadata provided by the discovery plugin (might be very limited).
nominatim – URL of the Nominatim server
datamart_geo_data – True or a datamart_geo.GeoData instance to use to resolve named administrative territorial entities
- Returns:
JSON structure (dict)
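The call can be sketched as follows. This is a minimal illustration, not an official example: the CSV contents, the file name cities.csv and the chosen option values are assumptions made up for the sketch, and the import is guarded so the script still runs where atlas_profiler is not installed. Only the parameter names come from the signature above; the keys of the returned dict are not documented here, so the sketch just lists whatever comes back.

```python
import csv
import os
import tempfile

# Guarded import: the profiling call is skipped if the package is missing.
try:
    from atlas_profiler import process_dataset
except ImportError:
    process_dataset = None

# Hypothetical sample data, written to disk because `data` may be a path.
rows = [
    ("name", "latitude", "longitude"),
    ("Paris", 48.8566, 2.3522),
    ("Tokyo", 35.6762, 139.6503),
    ("Lima", -12.0464, -77.0428),
]

# Keyword arguments taken from the documented signature; the values are
# illustrative choices, not recommendations.
options = {
    "geo_classifier": False,       # skip the classifier for a quick run
    "include_sample": True,        # ask for a few random rows in the result
    "coverage": True,              # compute data ranges
    "plots": False,                # do not compute plots
    "load_max_size": 1_000_000,    # sample the input if it exceeds ~1 MB
}

metadata = None
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "cities.csv")
    with open(csv_path, "w", newline="", encoding="utf-8") as fp:
        csv.writer(fp).writerows(rows)
    written_bytes = os.path.getsize(csv_path)

    if process_dataset is not None:
        # Returns a JSON-serializable dict according to the reference above.
        metadata = process_dataset(csv_path, **options)
        print(sorted(metadata))  # top-level keys only
```

Passing a path keeps the sketch free of a pandas dependency; a file object or a DataFrame could be passed as data instead, as the parameter description states.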