Geocoding is a procedure applied to a number of address records or site locations to generate, for each location, a pair of X and Y coordinates. For more information on geocoder terminology (types of geocoding, tolerances…), refer to the Universal Geocoder user documentation.
This document describes the Geoconcept Universal Geocoder component.
The description of the component breaks down into three separate sections:
- UGC JEE: this is a component designed to handle the integration of the Universal Geocoder (UGC) geocoding engine in the Java Enterprise Edition platform and its subsets, like the Tomcat servlet engine. By extension (deployment of an optional module) the product can be used with the aim of deploying a geocoding web service.
- UGC Command Line
- UGC .NET
Note: this documentation is not the documentation of the standalone Geoconcept Universal Geocoder.
This component is designed to handle the integration of the Universal Geocoder (UGC) geocoding engine in the Java Enterprise Edition platform and its subsets, such as the Tomcat servlet engine. By extension (deployment of an optional module) the product can be used with the aim of deploying a geocoding web service.
From a functional point of view, geocoding is a procedure applied to a series of address records to obtain their geographic coordinates. For more information about geocoder terminology (types of geocoding, tolerances…), refer to the Universal Geocoder user manual. This manual describes in detail how to deploy Universal Geocoder specifically for Java Enterprise Edition (ugc-jee).
JEE Integration
ugc-jee is a JEE platform integration component that provides a geocoding service to JEE modules deployed on an application server (in the broadest sense, including servlet engines such as Tomcat).
The consumer module can be of any type (webapp, ejb, etc.). Similarly, inside the JEE module, the consumer can be of any type (servlet, jsp, pojo, etc.).
Access mode
The application using the geocoder references the provider via a logical name in the JNDI (Java Naming and Directory Interface) directory of the application server. The method for setting up the provider is described later on.
By convention, the logical name used is “geoconcept/ugc/default” in the “java:comp/env” context, but it is also possible to use another name, another context, or to set up several providers of different types.
For simplicity, we start by describing the most common usage scenario: a single primary provider with the default name.
Component structure
Adapting the JEE resource and external components
ugc-jee breaks down into several parts. For reasons of performance and reuse, the engine (also called the kernel) is written in C++ and is therefore published in the form of native libraries. The integration part (resource adaptor) handles the link with the engine and manages its deployment by loading its native libraries. In terms of file deployment, this involves two distinct trees:
In terms of execution, the native libraries are loaded into memory in the process of the application server instance (or, where appropriate, its partition, depending on the model) at start-up. The resource adaptor part exposes a Java interface and drives the in-memory engine instance directly via JNI.
Another type of provider
The resource adaptor described previously corresponds to the LocalDll type. This is the primary provider, namely the one that is linked to the geocoding engine and, in practice, handles the processing.
In some situations (for various reasons: separate front end, sharing, architecture, isolation, etc.) it is desirable that the module consuming the service is not deployed in the same instance of the application server as the primary ugc provider (for example, each one being on a different server machine: remoting).
In this case there are several possibilities:
- either a dedicated publication module is created that has access to the ugc provider (in this case, the protocol used and the procedures are specific to the module created),
- or the consumer is implemented as a client of a module supplied with ugc-jee, for example the web service of the ugc-ws module,
- or a module provided by ugc-jee is deployed on the primary provider’s machine and a client resource adaptor for this module is set up on the consumer machine.
This last scenario is transparent for the consumer application: whether the provider is local or remote, the access code remains the same, and only the deployment and configuration differ (it is a bit like the Multiple business delegate pattern).
The transport protocol used depends on the publication module / client pair chosen: this could be web service (http/soap) or rmi.
In every scenario where ugc-jee is deployed in installed mode (that is, not in SaaS mode), there is a provider instance of the LocalDll type. Moreover, because this provider is local (no transfer, no serialisation, etc.), it is the solution offering the best performance (latency, bandwidth).
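The transparency between a local and a remote provider can be pictured with a small sketch. Everything below is hypothetical: `GeocodingProvider`, `LocalProvider`, `RemoteClientProvider` and `Consumer` are illustrative stand-ins, not ugc-jee classes, and the "remote" call is simulated in-process rather than sent over SOAP or RMI.

```java
// Hypothetical stand-in for the geocoding contract; not the real ugc-jee interface.
interface GeocodingProvider {
    String findAddress(String addressLine);
}

// Stand-in for a LocalDll-style provider: the work is done in-process.
class LocalProvider implements GeocodingProvider {
    public String findAddress(String addressLine) {
        return "local:" + addressLine;
    }
}

// Stand-in for a client-type provider: it forwards the call to another provider.
class RemoteClientProvider implements GeocodingProvider {
    private final GeocodingProvider remoteEndpoint;

    RemoteClientProvider(GeocodingProvider remoteEndpoint) {
        this.remoteEndpoint = remoteEndpoint;
    }

    public String findAddress(String addressLine) {
        // A real client would serialise the call over its transport here.
        return "remote->" + remoteEndpoint.findAddress(addressLine);
    }
}

class Consumer {
    // The consumer depends only on the interface, so its code does not change
    // when the provider behind it is swapped.
    static String geocode(GeocodingProvider provider, String addressLine) {
        return provider.findAddress(addressLine);
    }

    public static void main(String[] args) {
        GeocodingProvider local = new LocalProvider();
        System.out.println(geocode(local, "25 rue de Tolbiac"));
        System.out.println(geocode(new RemoteClientProvider(local), "25 rue de Tolbiac"));
    }
}
```

In a real deployment the choice between the two would be made by the JNDI configuration rather than in code, which is what keeps the consumer unchanged.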
A certain number of add-on modules are supplied with ugc-jee. None are essential to the correct operation of the resource adaptor, and their deployment is optional.
ugc-admin
ugc-admin is a webapp that checks the product has been correctly installed, handles the configuration and tests the application to ensure it is functioning correctly.
ugc-ws
ugc-ws is a webapp that publishes a geocoding web service. It is based on spring-ws (it publishes in document / literal encoding format).
ugc-axis-fusion
ugc-axis-fusion is a webapp that publishes a predeployed geocoding web service on Axis 1.4 (it publishes in rpc / soapenc encoding format).
ugc-remote
ugc-remote is a webapp for publishing the geocoding service via rmi. It is notably used for remoting in conjunction with the RmiClient type provider.
Configuring the application deployment
A repository made up of a series of files with .ugc.xxi file extensions constitutes a datasource.
The service.xml file located in %ugc%/conf defines the general configuration of the geocoding service provider. In particular, it contains a default configuration for the datasources (default‑datasource). You can also define a specific configuration to be assigned to a datasource.
To define a specific configuration for a datasource, it is advisable to use the ugc‑admin webapp (see the section on DataSourcesConfiguration, and then configuration > create). It is also possible to edit the service.xml file directly.
If no specific configuration has been defined, then the default configuration is applied on deployment of this datasource.
It is also possible to define certain parameters at call time (via findAddressOptions). In this case, the actual value of a parameter (for example, discrepancy) is taken from the first of the following sources that defines it, in this order:
- from the call if it is assigned in findAddressOptions (this assignment is optional)
- from the value indicated at the level of the specific datasource configuration (if a specific datasource configuration has been defined)
- from the value indicated at the level of the default datasource configuration (default-datasource).
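The precedence order above can be sketched as a simple fallback chain. This is only an illustration: `ParameterResolution` and `resolve` are hypothetical names, and `null` is used here as an assumed way of saying "not assigned at this level".

```java
class ParameterResolution {
    /**
     * Returns the first assigned value, in the documented order:
     * the call options, then the specific datasource configuration,
     * then the default-datasource configuration.
     * A null argument means "not assigned at that level" (an assumption of this sketch).
     */
    static double resolve(Double fromCall, Double fromDatasourceConfig, double fromDefaultDatasource) {
        if (fromCall != null) {
            return fromCall;
        }
        if (fromDatasourceConfig != null) {
            return fromDatasourceConfig;
        }
        return fromDefaultDatasource;
    }

    public static void main(String[] args) {
        // discrepancy assigned only in the specific datasource configuration
        System.out.println(resolve(null, 5.0, 2.0));
        // discrepancy assigned at call time: the call wins
        System.out.println(resolve(8.0, 5.0, 2.0));
    }
}
```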
Detailed configuration of a datasource
The deployment configuration of a referential can be defined in great detail.
This is only needed for particular requirements: the default configuration is suitable in the majority of cases.
Some parameters can be defined at the time of the call, and others are set at the creation of the datasource instance and are then subsequently not modifiable.
The parameters you may want to define at the moment of the call will be described in the FindAddressOptions section.
Datasource identification
This section enables identification of a datasource in the administration.
File: the geocoding referential used. It is located in %geoconcept%/ugc/reftables or in a sub-directory.
Name: an alias for the name of the datasource (optional)
Publication information
This section allows you to add information about the datasource.
Title: a title for the datasource.
Abstract: contains a short description of the datasource.
Online resource: HTTP link giving more information on the datasource.
Reference table settings
Version: version of the referential.
Zone meaning: meaning of the "zone" attribute for the address. Example: Post code.
UniqueID meaning: meaning of the uniqueId search identifier.
secondaryZone meaning: meaning of the secondaryZone attribute for the address. For example, IRIS code.
StreetSectionId meaning: meaning of the streetSectionId attribute for the address. For example, PID navteq.
Coordinate system: identifier for the reference coordinates system for the publication (Cf. OpenGIS standard).
Country: the country concerned.
Bounds: rectangular footprint for the data in the referential
Run time settings
Cache: activation or not of the cache. This cache enables towns in the referential to be kept in memory.
Max Cache Mem Size: the size of the memory reserved for the cache, in Kbytes. Only used if the cache is active.
Min processors: the initial number of geocoding instances.
Max processors: the maximum number of geocoding instances.
Finder general settings
City scoring method: calculation method to use for the town score:
- Standard: fast, but less precise. Recommended for batch use.
- Levenshtein: slower, but more precise. Recommended where the address is entered in a line.
Street scoring method: calculation method to use for the street score:
- Standard: fast, but less precise. Not recommended.
- Levenshtein: slower, but more precise. Recommended where addresses are entered in a line, or in a batch.
Min streets: minimum number of streets for a town to be considered as covered. Used for the tolerance at town level (FindAddressResults.TOLERATE_TYPE_CITY).
Search strategy:
For more detail about search strategies, refer to the Universal Geocoder user guide.
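To give an idea of what a Levenshtein-based scoring method measures, here is a minimal sketch of the edit distance between two names, normalised into a 0 to 1 score. The normalisation (one minus the distance divided by the longer length) and the names `LevenshteinScore`, `distance` and `score` are assumptions made for illustration; the formula actually used by the engine is not described here.

```java
class LevenshteinScore {
    /** Classic Levenshtein edit distance, computed with two rolling rows. */
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Assumed normalisation: 1.0 for identical strings, 0.0 when nothing matches. */
    static double score(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1.0;
        }
        return 1.0 - (double) distance(a, b) / maxLen;
    }

    public static void main(String[] args) {
        System.out.println(score("TOLBIAC", "TOLBIAK"));
    }
}
```

The extra precision comes at a cost: the distance is computed in time proportional to the product of the two string lengths, which is consistent with this method being described as slower than the Standard one.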
Finder request defaults settings
Max candidates: maximum number of results returned by a geocoding operation. See FindAddressOptions.candidateCount.
Score threshold: the score above which the geocoding results are retained. See FindAddressOptions.scoreThreshold.
Score threshold: threshold for the suggestion of "street" type candidates. See FindAddressOptions.scoreThreshold.
Find type: type of geocoding desired. See FindAddressOptions.findType.
Tolerate geocoding type: geocoding tolerance required. See FindAddressOptions.tolerateFindType.
Max meter error: maximum positioning error (in metres) for a geocoding at street level to be considered as a geocoding on street number. See FindAddressOptions.maxMeterError.
Discrepancy: lateral offset. See FindAddressOptions.discrepancy.
Discrepancy along street: longitudinal offset. See FindAddressOptions.discrepancyAlongStreet.
Favor city match element: steer the search for the town using one of its descriptive elements. See FindAddressOptions.favorCityMatchElement.
Zone match digits: take into account the first n characters of the zone attribute of the address. See FindAddressOptions.zoneMatchDigits.
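The "zone match digits" setting can be illustrated with a prefix comparison. This is a sketch under assumptions: `ZoneMatch` and `zonesMatch` are hypothetical names, and treating a value of 0 or less as "compare the whole code" is a choice made for this example, not documented behaviour.

```java
class ZoneMatch {
    /**
     * Compares only the first `digits` characters of two zone codes.
     * Assumption of this sketch: digits <= 0 means the whole code is compared.
     */
    static boolean zonesMatch(String referenceZone, String inputZone, int digits) {
        if (referenceZone == null || inputZone == null) {
            return false;
        }
        if (digits <= 0) {
            return referenceZone.equals(inputZone);
        }
        if (referenceZone.length() < digits || inputZone.length() < digits) {
            return false;
        }
        return referenceZone.regionMatches(0, inputZone, 0, digits);
    }

    public static void main(String[] args) {
        // With 2 digits, two Paris post codes are considered to match
        System.out.println(zonesMatch("75013", "75011", 2));
    }
}
```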
Tests / Troubleshooting for AXIS 1.4
Check that the resource adaptor has loaded correctly when Tomcat is started. You can use the webapp ugc-admin to validate the installation.
When using a Web Service, check that the AddressFinder service is present in axis:
When the provider is used directly inside the Java platform (via plain Java objects, or «POJOs»), the application code takes the following form:
- retrieving a connection to the provider via JNDI
- calling the findAddress function
The findAddress function has the following simplified form:
FindAddressResults findAddress(Address, FindAddressOptions);
This means in effect that the call takes as input an address and some options, and returns as output a certain number of candidates.
The precise breakdown is given below.
Package com.geoconcept.ugc.service
Class CodingProvider
getConnection method:
This method allows you to open a connection to the geocoding provider. A connection must subsequently be closed using its close method.
Prototype:
Connection getConnection() throws ResourceException;
Value returned:
Connection to the geocoding service provider.
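Because a connection must always be closed, a try/finally block is a natural way to use it. The sketch below uses `DemoConnection`, a hypothetical stand-in that only mimics the two operations discussed in this document (a geocoding call and close); it is not the real Connection class.

```java
// Hypothetical stand-in for a geocoding connection; only the shape of the API is mimicked.
class DemoConnection {
    boolean closed = false;

    String findAddress(String addressLine) {
        if (addressLine == null) {
            throw new IllegalArgumentException("no address");
        }
        return "candidate for " + addressLine;
    }

    void close() {
        closed = true;
    }
}

class SafeGeocoding {
    /** Geocodes one address and releases the connection whether or not the call fails. */
    static String geocodeAndClose(DemoConnection connection, String addressLine) {
        try {
            return connection.findAddress(addressLine);
        } finally {
            connection.close();
        }
    }

    public static void main(String[] args) {
        DemoConnection connection = new DemoConnection();
        System.out.println(geocodeAndClose(connection, "25 rue de Tolbiac"));
        System.out.println("closed: " + connection.closed);
    }
}
```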
Class Connection
findAddress method:
This method enables an address to be geocoded with geocoding options if required.
Prototype
FindAddressResults findAddress(Address address, FindAddressOptions options)
throws InvalidDataSourceException, InternalErrorException;
Parameter | Description |
---|---|
address | Address to geocode |
options | Geocoding options |
Value returned
Geocoding result list.
Package com.geoconcept.common.geo
Class Location
This class contains the coordinates found during a geocoding operation.
Members
Type | Name | Description |
---|---|---|
double | x | X coordinate |
double | y | Y coordinate |
String | coordinateSystem | Coordinate system identifier. "SRS" ("Spatial Reference System") identifier for the Web Map Service. See the OpenGIS standard. |
Package com.geoconcept.ugc
Class Address
This class contains the description of an address.
Members
Type | Name | Description |
---|---|---|
String | addressLine | Concatenation of the number, the street type and the name of the street. For example, 25 rue de Tolbiac. |
String | zone | Town code (for example, the post code 75013) |
String | cityName | Name of the town (for example, Paris) |
String | uniqueID | Unique code (town, zone) to search for (for example, 75113000) |
String | secondaryZone | Secondary code for the address (for example, the INSEE code, the IRIS code…) |
String | StreetSectionID | Road section identifier (not yet used) |
Class FindAddressOptions
This class contains the configuration for a geocoding operation.
All elements except dataSourceName are optional. If they are not assigned, they take their default value, which is optimal in the majority of cases.
Type | Name | Description |
---|---|---|
String | dataSourceName | Name of the data source, as defined in the administration, to use for the geocoding operation |
short | candidateCount | Maximum number of results to return when geocoding |
String | findType | The type of geocoding required. There are three types of geocoding: on towns (FindAddressResults.FIND_TYPE_CITY), on streets (FindAddressResults.FIND_TYPE_STREET), on street numbers (FindAddressResults.FIND_TYPE_STREET_NUMBER) |
String | tolerateFindType | This property allows you to set and obtain the cumulated value of the geocoding types tolerated. There are three tolerance levels (see the TOLERATE_TYPE_* constants of FindAddressResults). |
long | maxMeterError | Maximum positioning error (in metres) for a geocoding at street level to be considered as a geocoding on street number. The type of geocoding will therefore depend on the length of the street. |
double | discrepancy | Orthogonal offset (in metres) to apply to the street found. This allows you to avoid positioning the geocoded address in the middle of the street. |
double | discrepancyAlongStreet | Longitudinal offset (in metres) to apply. This allows you to avoid positioning the geocoded address on a crossroads. |
long | favorCityMatchElement | This allows you to improve the search for a town by specifying which element of the town (the name of the town, the town code, or the unique code for the town) is known to be reliable. The assignment of this value will therefore depend on the address to geocode. Three values are possible: the name of the town (FindAddressResults.FAVOR_CITY_NAME), the code for the town (FindAddressResults.FAVOR_ZONE), the unique code for the town (FindAddressResults.FAVOR_UNIQUE_ID). |
long | zoneMatchDigits | This allows you to set the number of characters to use when matching the town code stored in the reference table against the one passed as a parameter for geocoding. A value can only be specified if the favorCityMatchElement member is different from FindAddressResults.FAVOR_ZONE; otherwise the whole of the post code is used. |
int | scoreThreshold | Minimum score for candidates proposed by the geocoder to be retained |
int | scoreThreshold | Minimum score for candidates of the "street" type to be returned. If no street reaches this threshold, the town is returned as a candidate. |
String | coordinateSystem | Reference system for the coordinates to return |
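The geometric effect of the discrepancy and discrepancyAlongStreet options can be sketched on a single straight street segment. This is one possible reading, not the engine's implementation: the sketch assumes projected coordinates in metres, applies the lateral offset on the left-hand side of the segment direction, and interprets the longitudinal offset as a minimum distance to keep from both ends of the segment. `StreetOffset` and `offsetPoint` are hypothetical names.

```java
class StreetOffset {
    /**
     * Places a point on the segment (x1,y1)-(x2,y2) at distanceAlong metres from its start,
     * keeps it at least discrepancyAlongStreet metres away from both ends (assumed meaning),
     * then shifts it by discrepancy metres perpendicular to the street (left-hand side, assumed).
     * Returns {x, y}.
     */
    static double[] offsetPoint(double x1, double y1, double x2, double y2,
                                double distanceAlong, double discrepancy,
                                double discrepancyAlongStreet) {
        double dx = x2 - x1;
        double dy = y2 - y1;
        double length = Math.hypot(dx, dy);
        if (length == 0) {
            return new double[] { x1, y1 }; // degenerate segment: nothing to offset against
        }
        double ux = dx / length; // unit vector along the street
        double uy = dy / length;
        // The margin cannot exceed half the segment, otherwise the two ends would overlap
        double margin = Math.min(discrepancyAlongStreet, length / 2);
        double along = Math.max(margin, Math.min(length - margin, distanceAlong));
        double px = x1 + ux * along;
        double py = y1 + uy * along;
        // (-uy, ux) is the left-hand normal of the street direction
        return new double[] { px - uy * discrepancy, py + ux * discrepancy };
    }

    public static void main(String[] args) {
        double[] p = offsetPoint(0, 0, 100, 0, 0, 5, 10);
        System.out.println(p[0] + "," + p[1]);
    }
}
```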
Class FindAddressResults
Class that contains the results of a geocoding operation, sorted by score.
Members
Type | Name | Description |
---|---|---|
FindAddressResult[] | results | Array of candidates found |
int | matchType | Type of geocoding applied |
Constants
Type | Name | Value | Description |
---|---|---|---|
int | FIND_MATCH_TYPE_CITY | 1 | Type of geocoding requested: the address must be geocoded on the town |
int | FIND_MATCH_TYPE_STREET | 2 | Type of geocoding requested: the address must be geocoded on the street |
int | FIND_MATCH_TYPE_STREET_NUMBER | 3 | Type of geocoding requested: the address must be geocoded on the street number |
int | FOUND_MATCH_TYPE_CITY | 1 | Type of geocoding result: the address has been geocoded on the town |
int | FOUND_MATCH_TYPE_STREET | 2 | Type of geocoding result: the address has been geocoded on the street |
int | FOUND_MATCH_TYPE_STREET_ENHANCED | 3 | Type of geocoding result: the address has been geocoded on the approximate street number |
int | FOUND_MATCH_TYPE_STREET_NUMBER | 4 | Type of geocoding result: the address has been geocoded on the exact street number |
int | TOLERATE_TYPE_CITY | 1 | Tolerance of the geocoding on the town |
int | TOLERATE_TYPE_STREET | 2 | Tolerance of the geocoding on the street |
int | TOLERATE_TYPE_STREET_ENHANCED | 4 | Tolerance of the geocoding on the estimated street number |
int | FAVOR_CITY_NAME | 1 | Steers the search for the town to be geocoded by providing the name of the town |
int | FAVOR_ZONE | 2 | Steers the search for the town to be geocoded by providing the code for the town |
int | FAVOR_UNIQUE_ID | 3 | Steers the search for the town to be geocoded by providing the unique code of the town |
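The TOLERATE_TYPE_* values (1, 2 and 4) are powers of two, which suggests that the "cumulated value" of tolerated geocoding types is a bit mask. The sketch below works under that assumption only; how the cumulated value is actually encoded in tolerateFindType is not specified in this document. The constants are redeclared locally, and `ToleranceFlags`, `combine` and `tolerates` are hypothetical helpers.

```java
class ToleranceFlags {
    // Values copied from the TOLERATE_TYPE_* constants documented for FindAddressResults
    static final int TOLERATE_TYPE_CITY = 1;
    static final int TOLERATE_TYPE_STREET = 2;
    static final int TOLERATE_TYPE_STREET_ENHANCED = 4;

    /** Cumulates several tolerance levels with a bitwise OR (assumed encoding). */
    static int combine(int... flags) {
        int mask = 0;
        for (int flag : flags) {
            mask |= flag;
        }
        return mask;
    }

    /** Tells whether a given tolerance level is part of the cumulated value. */
    static boolean tolerates(int mask, int flag) {
        return (mask & flag) != 0;
    }

    public static void main(String[] args) {
        int mask = combine(TOLERATE_TYPE_CITY, TOLERATE_TYPE_STREET_ENHANCED);
        System.out.println(mask + " tolerates street: " + tolerates(mask, TOLERATE_TYPE_STREET));
    }
}
```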
Class FindAddressResult
Class that contains the result of a geocoding operation.
Members
Type | Name | Description |
---|---|---|
Address | address | Address found |
Location | location | Coordinates found |
double | score | Resemblance score between the address to be geocoded and the address found. Varies between 0 (no resemblance) and 1 (the exact address) |
int | type | Type of geocoding found. The possible values are the FOUND_MATCH_TYPE_* constants of FindAddressResults. |
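A typical use of the score and type members is to keep only the best candidate that is precise enough. The sketch below shows one way to do this; `Candidate` is a hypothetical stand-in for a geocoding result (it is not the real class), and the selection rule itself is an example rather than a recommendation from the product.

```java
import java.util.Optional;

// Hypothetical stand-in for a geocoding candidate; only score and type are modelled.
class Candidate {
    final String addressLine;
    final double score;
    final int type;

    Candidate(String addressLine, double score, int type) {
        this.addressLine = addressLine;
        this.score = score;
        this.type = type;
    }
}

class CandidateSelection {
    // Value copied from the FOUND_MATCH_TYPE_STREET constant documented for FindAddressResults
    static final int FOUND_MATCH_TYPE_STREET = 2;

    /** Returns the highest-scoring candidate whose type and score reach the given minimums. */
    static Optional<Candidate> best(Candidate[] results, int minType, double minScore) {
        Candidate best = null;
        for (Candidate candidate : results) {
            boolean preciseEnough = candidate.type >= minType && candidate.score >= minScore;
            if (preciseEnough && (best == null || candidate.score > best.score)) {
                best = candidate;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        Candidate[] results = {
            new Candidate("25 rue de Tolbiac", 0.95, 4),
            new Candidate("Paris", 0.99, 1)
        };
        best(results, FOUND_MATCH_TYPE_STREET, 0.8)
            .ifPresent(c -> System.out.println(c.addressLine));
    }
}
```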
Guiding principles
A geocoding operation consists of the following steps:
- Connection to the geocoding service provider,
- Construction of the address to geocode,
- Construction of the geocoding options,
- Execution of the geocoding,
- Use of the result,
- Disconnection from the geocoding service provider.
Example
Connection to the geocoding service provider
    import javax.naming.Context;
    import javax.naming.InitialContext;

    import com.geoconcept.ugc.service.CodingProvider;
    import com.geoconcept.ugc.service.Connection;

    /*
     * Open a connection to the geocoding server.
     * Returns null if the lookup or the connection fails.
     */
    public Connection getConnection() throws Exception {
        Connection connection = null;
        try {
            // get the JNDI environment context
            Context initCtx = new InitialContext();
            Context envCtx = (Context) initCtx.lookup("java:comp/env");
            // retrieve the geocoding provider from its logical name
            CodingProvider codingProvider = (CodingProvider) envCtx.lookup("geoconcept/ugc/default");
            connection = codingProvider.getConnection();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return connection;
    }
Construction of the address to geocode
    import com.geoconcept.ugc.Address;

    /*
     * Construct an address to geocode
     * @param addressLine street number + street type + street name. Sample: "25 rue de Tolbiac"
     * @param zone address zone. Sample: "75013"
     * @param cityName city name. Sample: "Paris"
     * @param uniqueId city unique identifier. Sample: "75113000"
     */
    public Address getAddressToGeocode(String addressLine, String zone, String cityName, String uniqueId) throws Exception {
        Address address = new Address();
        address.addressLine = addressLine;
        address.zone = zone;
        address.cityName = cityName;
        address.uniqueId = uniqueId;
        return address;
    }
Construction of the geocoding options
    import com.geoconcept.ugc.FindAddressOptions;

    /*
     * Build geocoding options for a data source defined in the administration
     * @param datasource Name of a data source
     */
    public FindAddressOptions getOptions(String datasource) throws Exception {
        // use the name passed as a parameter rather than a hard-coded one
        FindAddressOptions options = new FindAddressOptions(datasource);
        return options;
    }
Execution of the geocoding
    import com.geoconcept.ugc.service.Connection;
    import com.geoconcept.ugc.Address;
    import com.geoconcept.ugc.FindAddressOptions;
    import com.geoconcept.ugc.FindAddressResults;

    /*
     * Launch the geocoding process and retrieve the results
     * @param connection Connection to the geocoding server
     * @param address Address to geocode
     * @param options Geocoding options
     */
    public FindAddressResults findGeocode(Connection connection, Address address, FindAddressOptions options) throws Exception {
        FindAddressResults findAddressResults = connection.findAddress(address, options);
        return findAddressResults;
    }
Use of the result
    import com.geoconcept.ugc.FindAddressResults;
    import com.geoconcept.ugc.FindAddressResult;

    /*
     * Browse the geocoding results and display them
     * @param findAddressResults Results of the geocoding process
     */
    public void displayResult(FindAddressResults findAddressResults) throws Exception {
        // if at least one result was found
        if (findAddressResults.results.length > 0) {
            // display the column names
            System.out.println("Found results : ");
            System.out.println("N°" + "\t" + "Geocoding Type" + "\t" + "Score" + "\t"
                + "Address line" + "\t" + "Zone" + "\t" + "City Name" + "\t"
                + "City Unique Identifier" + "\t" + "Address Secondary Zone" + "\t"
                + "Coordinates (Coordinates System)");
            // for each result found
            for (int i = 0; i < findAddressResults.results.length; i++) {
                FindAddressResult findAddressResult = findAddressResults.results[i];
                String coordinateSystem = findAddressResult.location.coordinateSystem;
                if (coordinateSystem == null) {
                    coordinateSystem = "(Unknown)";
                }
                // type and score belong to the result itself, not to its address
                System.out.println(i + "\t"
                    + findAddressResult.type + "\t"
                    + findAddressResult.score + "\t"
                    + findAddressResult.address.addressLine + "\t"
                    + findAddressResult.address.zone + "\t"
                    + findAddressResult.address.cityName + "\t"
                    + findAddressResult.address.uniqueId + "\t"
                    + findAddressResult.address.secondaryZone + "\t"
                    + findAddressResult.location.x + "," + findAddressResult.location.y
                    + "(" + coordinateSystem + ")");
            }
        } else {
            System.out.println("No result found.");
        }
    }
Disconnection from the geocoding service provider
    import com.geoconcept.ugc.service.Connection;

    /*
     * Disconnect from the geocoding server
     * @param connection Connection to the geocoding server
     */
    public void closeConnection(Connection connection) throws Exception {
        connection.close();
    }
Full example
    import com.geoconcept.ugc.service.Connection;
    import com.geoconcept.ugc.Address;
    import com.geoconcept.ugc.FindAddressOptions;
    import com.geoconcept.ugc.FindAddressResults;

    public void geocodingSample() {
        Connection connection = null;
        try {
            // open the connection
            connection = getConnection();
            // construct the address to geocode
            Address address = getAddressToGeocode("25 rue de Tolbiac", "75013", "Paris", "");
            // build the geocoding options
            FindAddressOptions options = getOptions("myDataSource");
            // launch the geocoding process
            FindAddressResults findAddressResults = findGeocode(connection, address, options);
            // print the geocoding result
            displayResult(findAddressResults);
        } catch (Exception e) {
            // geocoding problem
        } finally {
            // always disconnect, even if the geocoding failed
            if (connection != null) {
                try {
                    closeConnection(connection);
                } catch (Exception e) {
                    // disconnection problem
                }
            }
        }
    }