Surveying, Mapping and GIS

Exploring all aspects of mapping and geography, from field data collection, to mapping and analysis, to integration, applications development and enterprise architecture...

Vector Data, SOA and Scalability

Posted by Dave Smith On 6/26/2007 04:35:00 PM 3 comments

One of the things I am still trying to get my head around is scalability in vector-based web services, such as OGC Web Feature Services or ArcIMS Feature Services. Certainly with image services, one can do a lot of magic behind the scenes - such as tiling, caching, load balancing.

In many instances, an image service will suffice, but for power users, for ad-hoc queries and analysis, the full geometry and attribute data are often needed. And in the case of a distributed enterprise, here is one place where a purely SOA approach begins to break down.

Things become a bit more difficult when it comes to vector geometry, as most GIS clients are still geared only toward consuming and processing vector data in one chunk.

Further, vector geometries can't easily be broken into tiles without causing other breakage - polygons and linear features need to retain their topological integrity in order to work.
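
To make the breakage concrete, here is a minimal Python sketch of what naive tiling does to a single linear feature. The function name `split_at_x` and the sample coordinates are made up for illustration, and it only handles a vertical tile edge; real clipping code would also have to deal with polygons, horizontal edges and vertices that sit exactly on the boundary.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def split_at_x(line: List[Point], x_cut: float) -> List[List[Point]]:
    """Cut a polyline wherever it strictly crosses the vertical line x = x_cut."""
    fragments: List[List[Point]] = [[line[0]]]
    for (x0, y0), (x1, y1) in zip(line, line[1:]):
        if (x0 - x_cut) * (x1 - x_cut) < 0:  # endpoints lie on opposite sides
            t = (x_cut - x0) / (x1 - x0)
            crossing = (x_cut, y0 + t * (y1 - y0))
            fragments[-1].append(crossing)  # close the piece in this tile
            fragments.append([crossing])    # start the piece in the next tile
        fragments[-1].append((x1, y1))
    return fragments


road = [(0.0, 0.0), (2.0, 2.0)]        # one feature, two vertices
pieces = split_at_x(road, x_cut=1.0)   # hypothetical tile edge at x = 1
print(pieces)  # [[(0.0, 0.0), (1.0, 1.0)], [(1.0, 1.0), (2.0, 2.0)]]
```

One road with one attribute record has become two geometries sharing a vertex that was never in the source data, and the client has to stitch them back together before it can measure or query the feature as a whole.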

Yes, one can certainly cache vector feature services, provided the underlying data is relatively static, or apply constraints limiting the amount of data that one can fetch at a time. But is there any possibility, looking down the road, of using more efficient serial or parallel processes to stream large and complex vector datasets rapidly?
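
As a rough illustration of the "limit what can be fetched at a time" option, the Python sketch below pulls features in bounded pages and hands them to the caller one by one, so processing can begin before the whole dataset has arrived. Everything here is an assumption for illustration: `fetch_page(offset, limit)` is a hypothetical callable standing in for whatever paging a given service might offer, and it presumes the service returns features in a stable order.

```python
from typing import Callable, Dict, Iterator, List

Feature = Dict[str, int]
FetchPage = Callable[[int, int], List[Feature]]


def stream_features(fetch_page: FetchPage, page_size: int) -> Iterator[Feature]:
    """Yield features one at a time while requesting them in bounded pages."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page
        if len(page) < page_size:  # a short page means the data is exhausted
            return
        offset += len(page)


# In-memory stand-in for a remote feature service (hypothetical data).
PARCELS: List[Feature] = [{"id": i} for i in range(7)]


def fetch_parcels(offset: int, limit: int) -> List[Feature]:
    return PARCELS[offset:offset + limit]


ids = [feature["id"] for feature in stream_features(fetch_parcels, page_size=3)]
print(ids)  # [0, 1, 2, 3, 4, 5, 6]
```

Because `stream_features` is a generator, a client could draw or index the first page while later pages are still being requested. It does nothing to solve the harder problem of clients that expect the full dataset in one chunk.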

I think there will need to be, and I don't yet see OGC, ESRI or anyone else looking at this. I'd be interested in hearing other folks' thoughts and experiences on this...

3 Responses to "Vector Data, SOA and Scalability"

  1. Bill says:

    Most of the problem is related to storing plain lists of points.

    Several alternative data structures for storing geometry, most notably quad-trees, yield easier solutions to these problems by building the conceptual processing models into the data storage format.

  2. Josh Bing says:

    I work for a company that needs a wireless data collection tool that will enable me to make custom forms on my Blackberry or Motorola phone. It would be great if the forms could support drop-down menus, GPS, check boxes, bar coding and photo capture. Can someone help me with this, please?

  3. I've been thinking about this a good bit. I think good caching can take us quite far, having vector data be more like a 'checkout' of source code, a la cvs or svn. If you're interested in an area then you wait a bit to download it and then you'd have the cached copy. If you leverage HTTP's conditional GET then you can even deal with fairly dynamic vector data, having clients ask each time if they should reload, and then doing an actual reload only if something's changed.

    As for dividing things up, I think you can do some tiling, you just have to let tiles retrieve more than their strict area, and be prepared for duplicates.

    For streaming vector data, with GeoServer we're doing about 10Mb/s of fairly complex geometries, which would fill up a browser really fast, and even brings Google Earth to a halt in not too much time. And we're getting those speeds from a single PC. Yes, you'd have to wait a while for 20 gigs, but with good caching you'd only have to do that checkout once. And you could start working in the area you care about while the rest is downloading.
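
The conditional GET idea in the last comment can be sketched in a few lines of Python. This is an in-memory simulation with no real network traffic: `serve_features` and `FeatureCache` are hypothetical names, and the "server" is just a function. Only the `ETag` and `If-None-Match` headers and the 304 Not Modified status come from HTTP itself.

```python
import hashlib
from typing import Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str], bytes]  # (status, headers, body)
Transport = Callable[[str, Dict[str, str]], Response]


def serve_features(store: Dict[str, bytes], path: str,
                   headers: Dict[str, str]) -> Response:
    """Toy server: answer 304 with no body if the client's ETag is current."""
    body = store[path]
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    if headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


class FeatureCache:
    """Toy client: keeps one copy per path and revalidates it on every get()."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport
        self._entries: Dict[str, Tuple[str, bytes]] = {}  # path -> (etag, body)
        self.downloads = 0  # number of full bodies actually transferred

    def get(self, path: str) -> bytes:
        cached = self._entries.get(path)
        headers = {"If-None-Match": cached[0]} if cached else {}
        status, resp_headers, body = self._transport(path, headers)
        if status == 304 and cached:
            return cached[1]  # unchanged on the server, reuse the local copy
        self._entries[path] = (resp_headers["ETag"], body)
        self.downloads += 1
        return body


store = {"/roads": b"v1 geometry"}
cache = FeatureCache(lambda path, headers: serve_features(store, path, headers))
cache.get("/roads")                # first request: full download
cache.get("/roads")                # second request: 304, served from cache
store["/roads"] = b"v2 geometry"   # the data changes on the server
cache.get("/roads")                # ETag no longer matches: full download
print(cache.downloads)  # 2
```

Each `get()` still costs a round trip, but the large geometry payload only crosses the wire when the data has actually changed, which is the "checkout once" behaviour described above.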