src.digitaltwin.data_to_db

This script fetches geospatial data from various providers using the ‘geoapis’ library and stores it in the database. It also saves user log information in the database.

Attributes

log

Exceptions

NoNonIntersectionError

Exception raised when no non-intersecting area is found.

Functions

get_nz_geospatial_layers(→ pandas.DataFrame)

Retrieve geospatial layers from the database that have a coverage area of New Zealand.

get_non_nz_geospatial_layers(→ pandas.DataFrame)

Retrieve geospatial layers from the database that do not have a coverage area of New Zealand.

get_geospatial_layer_info(→ Tuple[str, int, str, str])

Extract geospatial layer information from a single layer entry.

get_vector_data_id_not_in_db(→ Set[int])

Get the IDs from the fetched vector_data that are not present in the specified database table for the area of interest.

nz_geospatial_layers_data_to_db(→ None)

Fetch New Zealand geospatial layers data using 'geoapis' and store it into the database.

get_non_intersection_area_from_db(→ geopandas.GeoDataFrame)

Get the non-intersecting area from the catchment area and user log information table in the database for the specified table.

process_new_non_nz_geospatial_layers(→ None)

Fetch new non-NZ geospatial layers data using 'geoapis' and store it into the database.

process_existing_non_nz_geospatial_layers(→ None)

Fetch existing non-NZ geospatial layers data using 'geoapis' and store it into the database.

non_nz_geospatial_layers_data_to_db(→ None)

Fetch non-NZ geospatial layers data using 'geoapis' and store it into the database.

store_geospatial_layers_data_to_db(→ None)

Fetch geospatial layers data using 'geoapis' and store it into the database.

user_log_info_to_db(→ None)

Store user log information to the database.

Module Contents

src.digitaltwin.data_to_db.log

exception src.digitaltwin.data_to_db.NoNonIntersectionError

Bases: Exception

Exception raised when no non-intersecting area is found.

src.digitaltwin.data_to_db.get_nz_geospatial_layers(engine: sqlalchemy.engine.Engine) → pandas.DataFrame

Retrieve geospatial layers from the database that have a coverage area of New Zealand.

Parameters:

engine (Engine) – The engine used to connect to the database.

Returns:

Data frame containing geospatial layers that have a coverage area of New Zealand.

Return type:

pd.DataFrame

src.digitaltwin.data_to_db.get_non_nz_geospatial_layers(engine: sqlalchemy.engine.Engine) → pandas.DataFrame

Retrieve geospatial layers from the database that do not have a coverage area of New Zealand.

Parameters:

engine (Engine) – The engine used to connect to the database.

Returns:

Data frame containing geospatial layers that do not have a coverage area of New Zealand.

Return type:

pd.DataFrame
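
The two retrieval functions above split the layer catalogue by coverage area. The following is a minimal sketch of that split, not the module's implementation: it uses the standard-library sqlite3 module in place of a SQLAlchemy engine and pandas, and the table name geospatial_layers, its columns, the helper layer_ids, and the literal 'New Zealand' are all hypothetical stand-ins.

```python
import sqlite3

# Hypothetical catalogue table; the real schema is not shown in this reference.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE geospatial_layers ("
    "layer_id INTEGER, data_provider TEXT, coverage_area TEXT)"
)
conn.executemany(
    "INSERT INTO geospatial_layers VALUES (?, ?, ?)",
    [
        (1, "provider_a", "New Zealand"),
        (2, "provider_a", None),          # no coverage area recorded
        (3, "provider_b", "Canterbury"),  # hypothetical regional layer
    ],
)


def layer_ids(conn: sqlite3.Connection, nz: bool) -> list:
    """Return layer IDs whose coverage area is (nz=True) or is not (nz=False) New Zealand."""
    # SQLite's IS / IS NOT comparisons are NULL-safe, so rows with a NULL
    # coverage area fall into the non-NZ group instead of being dropped,
    # which is what a plain '!=' comparison would do.
    operator = "IS" if nz else "IS NOT"
    query = (
        "SELECT layer_id FROM geospatial_layers "
        f"WHERE coverage_area {operator} 'New Zealand' ORDER BY layer_id"
    )
    return [row[0] for row in conn.execute(query)]


nz_layers = layer_ids(conn, nz=True)
non_nz_layers = layer_ids(conn, nz=False)
```

Whether the real catalogue stores missing coverage areas as NULL is an assumption; if it does, the NULL-safe comparison is what keeps those layers in the non-NZ result.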

src.digitaltwin.data_to_db.get_geospatial_layer_info(layer_row: pandas.Series) → Tuple[str, int, str, str]

Extract geospatial layer information from a single layer entry.

Parameters:

layer_row (pd.Series) – A geospatial layer row that represents a single geospatial layer along with its associated information.

Returns:

A tuple containing the values for data_provider, layer_id, table_name, and unique_column_name.

Return type:

Tuple[str, int, str, str]
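
As an illustration of the extraction described above, here is a small sketch that unpacks one layer row into the documented tuple. It assumes the row's keys are named after the returned values (data_provider, layer_id, table_name, unique_column_name), which this reference does not confirm, and it uses a plain dict in place of a pandas.Series, since both support the same key-based indexing. The name get_layer_info_sketch and the sample values are hypothetical.

```python
from typing import Any, Mapping, Tuple


def get_layer_info_sketch(layer_row: Mapping[str, Any]) -> Tuple[str, int, str, str]:
    """Pull the four documented fields out of a single layer row."""
    # Key names are assumed to match the names of the returned values.
    return (
        str(layer_row["data_provider"]),
        int(layer_row["layer_id"]),  # coerce in case the ID was read as text
        str(layer_row["table_name"]),
        str(layer_row["unique_column_name"]),
    )


# A dict stands in for one row of the layers data frame.
row = {
    "data_provider": "provider_a",
    "layer_id": "101",
    "table_name": "example_table",
    "unique_column_name": "feature_id",
    "coverage_area": None,  # extra fields are simply ignored
}

info = get_layer_info_sketch(row)
data_provider, layer_id, table_name, unique_column_name = info
```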

src.digitaltwin.data_to_db.get_vector_data_id_not_in_db(engine: sqlalchemy.engine.Engine, vector_data: geopandas.GeoDataFrame, table_name: str, unique_column_name: str, area_of_interest: geopandas.GeoDataFrame) → Set[int]

Get the IDs from the fetched vector_data that are not present in the specified database table for the area of interest.

Parameters:
  • engine (Engine) – The engine used to connect to the database.

  • vector_data (gpd.GeoDataFrame) – A GeoDataFrame containing the fetched vector data.

  • table_name (str) – The name of the table in the database.

  • unique_column_name (str) – The name of the unique column in the table.

  • area_of_interest (gpd.GeoDataFrame) – A GeoDataFrame representing the area of interest.

Returns:

The set of IDs from the fetched vector_data that are not present in the specified table in the database.

Return type:

Set[int]
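
At its core, the result described above is a set difference between the fetched IDs and the IDs already stored. The sketch below shows only that step, under stated assumptions: it uses sqlite3 rather than a SQLAlchemy engine, takes a plain list of IDs rather than a GeoDataFrame, and leaves out the spatial filter on area_of_interest entirely. The function name ids_not_in_db and the table example_layer are hypothetical.

```python
import sqlite3
from typing import Iterable, Set


def ids_not_in_db(
    conn: sqlite3.Connection,
    fetched_ids: Iterable[int],
    table_name: str,
    unique_column_name: str,
) -> Set[int]:
    """Return the fetched IDs that have no matching row in the table."""
    # Identifiers cannot be bound as query parameters, so they are quoted
    # here and assumed to come from trusted configuration.
    query = f'SELECT "{unique_column_name}" FROM "{table_name}"'
    existing = {row[0] for row in conn.execute(query)}
    return set(fetched_ids) - existing


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example_layer (feature_id INTEGER)")
conn.executemany("INSERT INTO example_layer VALUES (?)", [(1,), (2,), (3,)])

# IDs 2 and 3 are already stored, so only 4 and 5 remain.
missing = ids_not_in_db(conn, [2, 3, 4, 5], "example_layer", "feature_id")
```

Returning a set makes it cheap to filter the fetched rows afterwards, so that only new records are appended to the table.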

src.digitaltwin.data_to_db.nz_geospatial_layers_data_to_db(engine: sqlalchemy.engine.Engine, crs: int = 2193, verbose: bool = False) → None

Fetch New Zealand geospatial layers data using ‘geoapis’ and store it into the database.

Parameters:
  • engine (Engine) – The engine used to connect to the database.

  • crs (int = 2193) – The coordinate reference system (CRS) code to use. Default is 2193.

  • verbose (bool = False) – Whether to print messages. Default is False.

src.digitaltwin.data_to_db.get_non_intersection_area_from_db(engine: sqlalchemy.engine.Engine, catchment_area: geopandas.GeoDataFrame, table_name: str) → geopandas.GeoDataFrame

Get the non-intersecting area from the catchment area and user log information table in the database for the specified table.

Parameters:
  • engine (Engine) – The engine used to connect to the database.

  • catchment_area (gpd.GeoDataFrame) – A GeoDataFrame representing the catchment area.

  • table_name (str) – The name of the table in the database.

Returns:

The non-intersecting area, or the original catchment area if no intersections are found.

Return type:

gpd.GeoDataFrame

Raises:

NoNonIntersectionError – Raised when the non-intersecting area is empty, which suggests that the catchment area is already fully covered.
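
The three outcomes documented above (a reduced area, the unchanged catchment area, or an exception) can be sketched without any geometry library by modelling areas as sets of grid cells. This is an analogy under stated assumptions, not the module's code: the real function works on GeoDataFrames and reads previously logged areas from the database, and the class below is only a stand-in for the module's exception of the same name.

```python
from typing import Iterable, Set, Tuple

Cell = Tuple[int, int]  # hypothetical grid cell standing in for real geometry


class NoNonIntersectionError(Exception):
    """Stand-in for the module's exception of the same name."""


def non_intersection_cells(
    catchment: Set[Cell], logged_areas: Iterable[Set[Cell]]
) -> Set[Cell]:
    """Return the part of the catchment not covered by previously logged areas."""
    covered: Set[Cell] = set().union(*logged_areas)
    if not covered & catchment:
        # No intersections found: the original catchment is returned unchanged.
        return set(catchment)
    remaining = catchment - covered
    if not remaining:
        # Nothing is left to fetch: the catchment is already fully covered.
        raise NoNonIntersectionError("catchment area is already fully covered")
    return remaining


catchment = {(0, 0), (0, 1), (1, 0), (1, 1)}

partial = non_intersection_cells(catchment, [{(0, 0), (0, 1)}])
untouched = non_intersection_cells(catchment, [])

try:
    non_intersection_cells(catchment, [catchment])
    fully_covered = False
except NoNonIntersectionError:
    fully_covered = True
```

A caller would typically treat the exception as a signal to skip fetching for that table rather than as a failure.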

src.digitaltwin.data_to_db.process_new_non_nz_geospatial_layers(engine: sqlalchemy.engine.Engine, data_provider: str, layer_id: int, table_name: str, area_of_interest: geopandas.GeoDataFrame, crs: int = 2193, verbose: bool = False) → None

Fetch new non-NZ geospatial layers data using ‘geoapis’ and store it into the database.

Parameters:
  • engine (Engine) – The engine used to connect to the database.

  • data_provider (str) – The data provider of the geospatial layer.

  • layer_id (int) – The ID of the geospatial layer.

  • table_name (str) – The database table name of the geospatial layer.

  • area_of_interest (gpd.GeoDataFrame) – A GeoDataFrame representing the area of interest.

  • crs (int = 2193) – The coordinate reference system (CRS) code to use. Default is 2193.

  • verbose (bool = False) – Whether to print messages. Default is False.

src.digitaltwin.data_to_db.process_existing_non_nz_geospatial_layers(engine: sqlalchemy.engine.Engine, data_provider: str, layer_id: int, table_name: str, unique_column_name: str, area_of_interest: geopandas.GeoDataFrame, crs: int = 2193, verbose: bool = False) → None

Fetch existing non-NZ geospatial layers data using ‘geoapis’ and store it into the database.

Parameters:
  • engine (Engine) – The engine used to connect to the database.

  • data_provider (str) – The data provider of the geospatial layer.

  • layer_id (int) – The ID of the geospatial layer.

  • table_name (str) – The database table name of the geospatial layer.

  • unique_column_name (str) – The unique column name used for record identification in the database table.

  • area_of_interest (gpd.GeoDataFrame) – A GeoDataFrame representing the area of interest.

  • crs (int = 2193) – The coordinate reference system (CRS) code to use. Default is 2193.

  • verbose (bool = False) – Whether to print messages. Default is False.

src.digitaltwin.data_to_db.non_nz_geospatial_layers_data_to_db(engine: sqlalchemy.engine.Engine, catchment_area: geopandas.GeoDataFrame, crs: int = 2193, verbose: bool = False) → None

Fetch non-NZ geospatial layers data using ‘geoapis’ and store it into the database.

Parameters:
  • engine (Engine) – The engine used to connect to the database.

  • catchment_area (gpd.GeoDataFrame) – A GeoDataFrame representing the catchment area.

  • crs (int = 2193) – The coordinate reference system (CRS) code to use. Default is 2193.

  • verbose (bool = False) – Whether to print messages. Default is False.
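
This reference does not state how the function chooses between process_new_non_nz_geospatial_layers and process_existing_non_nz_geospatial_layers. One plausible criterion, offered here purely as an assumption, is whether the layer's table already exists in the database. The sketch below shows such a check with sqlite3; the helpers table_exists and choose_path and the table names are hypothetical.

```python
import sqlite3


def table_exists(conn: sqlite3.Connection, table_name: str) -> bool:
    """Check the SQLite catalogue for a table with the given name."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    return row is not None


def choose_path(conn: sqlite3.Connection, table_name: str) -> str:
    """Assumed dispatch rule: an existing table takes the 'existing' path."""
    return "existing" if table_exists(conn, table_name) else "new"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE layer_a (feature_id INTEGER)")

# layer_a was created above; layer_b has never been stored.
paths = {name: choose_path(conn, name) for name in ("layer_a", "layer_b")}
```

With a SQLAlchemy engine, sqlalchemy.inspect(engine).has_table(table_name) offers a comparable existence check.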

src.digitaltwin.data_to_db.store_geospatial_layers_data_to_db(engine: sqlalchemy.engine.Engine, catchment_area: geopandas.GeoDataFrame, crs: int = 2193, verbose: bool = False) → None

Fetch geospatial layers data using ‘geoapis’ and store it into the database.

Parameters:
  • engine (Engine) – The engine used to connect to the database.

  • catchment_area (gpd.GeoDataFrame) – A GeoDataFrame representing the catchment area.

  • crs (int = 2193) – The coordinate reference system (CRS) code to use. Default is 2193.

  • verbose (bool = False) – Whether to print messages. Default is False.

src.digitaltwin.data_to_db.user_log_info_to_db(engine: sqlalchemy.engine.Engine, catchment_area: geopandas.GeoDataFrame) → None

Store user log information to the database.

Parameters:
  • engine (Engine) – The engine used to connect to the database.

  • catchment_area (gpd.GeoDataFrame) – A GeoDataFrame representing the catchment area.