Mapping the IVAucher

As a reaction to record-high fuel prices, the Portuguese government has updated the IVAucher program to allow each citizen to recover 10 cents per liter of fuel purchased, up to a maximum of 5 EUR/month. This blog post is not going to discuss whether this is a good way of spending the public budget, or whether it is going to make a real impact on the lives of the people who manage to subscribe to the program. Instead, I want to focus on data.

Once you subscribe to the program as a consumer, you just need to fill the tank at one of the gas stations that subscribed to the program as businesses. The IVAucher website publishes a list of subscribed stations, which seems to be updated from time to time. The list is published as a PDF with 2746 records, ordered by the “distrito” and “concelho” administrative units.

When I looked for the stations around me, in the “concelho” of Lisbon, I found 67 records. In order to know where to go, I would literally need to go through each record and check whether I know the address or the name of the station. Lisbon is a big city, and I admit that there are lots of street names that I don’t know – and I don’t need to, because this is “why” we have maps. My first thought was that this data belonged on a map, and my second thought was that the data should be published in such a way that it would enable other people to create maps – and this is how this project was born.

In the five-star deployment scheme for Open Data, PDF is at the very bottom, and it is easy to understand why: there is only so much you can do with a format that is largely unstructured.

In order to be able to process these data, I had to transform them into a structured format, preferably non-proprietary, so I chose CSV (3 stars). This was achieved using a combination of command-line processing tools (e.g.: pdftotext, sed and grep).
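
The exact commands used for this conversion are not shown here. As a rough illustration of the same idea in Python, the sketch below assumes the PDF was first dumped to text with `pdftotext -layout`, so that each record sits on one line with columns separated by runs of spaces. The column names and the sample rows are hypothetical, and the real PDF may well need extra clean-up rules.

```python
import csv
import io
import re

# Hypothetical column layout; the real PDF may use different fields.
COLUMNS = ["distrito", "concelho", "name", "address"]


def layout_text_to_rows(text):
    """Split layout-preserving text into records.

    Assumes one record per line, with columns separated by two or more spaces.
    """
    rows = []
    for line in text.splitlines():
        fields = re.split(r"\s{2,}", line.strip())
        # Keep only lines with the expected number of columns
        # (this drops blank lines, page numbers and similar debris).
        if len(fields) == len(COLUMNS):
            rows.append(dict(zip(COLUMNS, fields)))
    return rows


def rows_to_csv(rows):
    """Serialize the records as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample in the assumed layout, including one line of debris.
SAMPLE = """
Lisboa    Lisboa    Posto Exemplo A    Avenida Exemplo 1
Page 1
Porto     Porto     Posto Exemplo B    Rua Exemplo 2
"""

print(rows_to_csv(layout_text_to_rows(SAMPLE)))
```

Filtering on the number of columns is a blunt rule: a header line with the same number of columns would pass through, so a real conversion would probably need to drop it explicitly.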

The next step was to publish these data following the FAIR principles, so that they are Findable, Accessible, Interoperable and Reusable. In order to do that, I chose the OGC API Features standard, which allows publishing vector geospatial data on the web. This standard defines a RESTful API with JSON encodings, which fits the expectations of modern web applications. I used a Python implementation of OGC API Features, called pygeoapi.

Before getting the data into pygeoapi, I had to georeference it. In order to do forward geocoding, I used the OpenCage API, and more specifically a Python client, which is one of the many supported SDKs. After tweaking the parameters, the results were quite good, and I was even able to georeference some incomplete addresses, something that was not possible using the OSM Nominatim API.
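
To give an idea of what a forward geocoding request looks like, here is a minimal sketch that talks to the OpenCage HTTP endpoint using only the standard library, rather than the Python client mentioned above. The address, the placeholder key and the canned response are made up for illustration; the parameters actually used for this project are not shown in this post.

```python
import json
from urllib.parse import urlencode

API_URL = "https://api.opencagedata.com/geocode/v1/json"


def build_forward_url(address, key, countrycode="pt", limit=1):
    """Build a forward geocoding request URL for one address."""
    params = {"q": address, "key": key, "countrycode": countrycode, "limit": limit}
    return f"{API_URL}?{urlencode(params)}"


def first_point(payload):
    """Return (lat, lng) of the best match, or None if nothing was found."""
    results = payload.get("results", [])
    if not results:
        return None
    geometry = results[0]["geometry"]
    return geometry["lat"], geometry["lng"]


# Canned response with the same shape as the API output (values are made up).
SAMPLE_RESPONSE = json.loads(
    '{"results": [{"formatted": "Example address", '
    '"geometry": {"lat": 38.72, "lng": -9.14}}]}'
)

print(build_forward_url("Avenida Exemplo 1, Lisboa", "YOUR-API-KEY"))
print(first_point(SAMPLE_RESPONSE))
```

Restricting the search to a country with `countrycode` is one of the simplest ways of reducing false matches when all the addresses are known to be in the same country.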

The next thing was to get the data into a format which supports geometry. The CSV was transformed into GeoJSON using GDAL/ogr2ogr. I could have published it as GeoJSON in pygeoapi, but indexing it into a database adds support for more functionality, so I decided to store it in a MongoDB NoSQL data store. Everything was virtualized into docker containers, and orchestrated using this docker-compose file.
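
The conversion itself was done with ogr2ogr. Purely to illustrate what it amounts to, the sketch below performs the same kind of transformation with the Python standard library. It is a stand-in, not the command that was actually run, and the `lon`/`lat` column names and the sample row are assumptions.

```python
import csv
import io
import json


def csv_to_geojson(csv_text, lon_field="lon", lat_field="lat"):
    """Turn CSV rows with coordinate columns into a GeoJSON FeatureCollection."""
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Remove the coordinate columns from the attributes and parse them.
        lon = float(row.pop(lon_field))
        lat = float(row.pop(lat_field))
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": row,
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up record in the assumed layout.
SAMPLE_CSV = "name,lon,lat\nPosto Exemplo A,-9.14,38.72\n"

print(json.dumps(csv_to_geojson(SAMPLE_CSV), indent=2))
```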

The application was deployed in AWS and the collection is available at this endpoint:

https://features.byteroad.net/collections/gas_stations

This means that anyone is able to consume this data and create their own maps, whether they are using QGIS, ArcGIS, JavaScript, Python, etc. All they need is an application which implements the OGC API Features standard.
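
As a hedged sketch of what consuming the collection could look like from Python, the helpers below build a request for the `items` of a collection and reorder the returned coordinates. The `limit` and `bbox` parameters come from the OGC API Features standard; `f=json` is how pygeoapi lets you ask for a JSON response, and I am assuming it applies to this deployment. The sample feature is made up.

```python
from urllib.parse import urlencode

COLLECTION_URL = "https://features.byteroad.net/collections/gas_stations"


def items_url(collection_url, bbox=None, limit=100):
    """Build an /items request; bbox is (min_lon, min_lat, max_lon, max_lat)."""
    params = {"f": "json", "limit": limit}
    if bbox is not None:
        params["bbox"] = ",".join(str(value) for value in bbox)
    return f"{collection_url}/items?{urlencode(params)}"


def to_lat_lng(feature_collection):
    """Return (lat, lng) pairs for the point features of a GeoJSON collection.

    GeoJSON stores [lon, lat], while many web maps expect (lat, lng).
    """
    return [
        (feature["geometry"]["coordinates"][1], feature["geometry"]["coordinates"][0])
        for feature in feature_collection["features"]
        if feature["geometry"] and feature["geometry"]["type"] == "Point"
    ]


# Made-up response fragment with the shape of a GeoJSON FeatureCollection.
SAMPLE_ITEMS = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [-9.14, 38.72]},
            "properties": {"name": "Posto Exemplo A"},
        }
    ],
}

print(items_url(COLLECTION_URL, bbox=(-9.3, 38.6, -9.0, 38.8), limit=10))
print(to_lat_lng(SAMPLE_ITEMS))
```

The axis swap in `to_lat_lng` is worth keeping in mind: mixing up longitude and latitude is probably the most common mistake when moving GeoJSON into a web map.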

I also created a map, using React.js and the Leaflet library. Although Leaflet does not support OGC API Features natively, I was able to fetch the data as GeoJSON, by following this approach.

The resulting application is available here:

https://ivaucher.byteroad.net

Now you can navigate through the map until you find your area of interest, or even type an address in the search box, to let the map fly to that location.

Hopefully, this application will make the user experience of the IVAucher program a bit easier, but it will also demonstrate the importance of using standards in order to leverage the use of geospatial information. Making data available on the web is good, but it is time that we move a step forward and question “how” we are making the data available, in order to ensure that its full potential is unlocked.

Geocoding in QGIS with OpenCage

Anyone working with geospatial data has probably encountered, at some point, the need for geocoding. The task of transforming an address (e.g.: a placename, city, postcode) into a pair of coordinates (e.g.: a point geometry) is called forward geocoding, while the task of transforming a pair of coordinates into an address is called reverse geocoding.
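
Some geocoding APIs, the OpenCage API among them, accept both kinds of request through the same query parameter: free text for forward geocoding, and a "latitude,longitude" string for reverse geocoding. The helper below is a small sketch of that distinction; the function name is my own and the sample values are made up.

```python
def to_query(value):
    """Return the query string for an address (forward) or a (lat, lng) pair (reverse)."""
    if isinstance(value, str):
        # Forward geocoding: the query is the address itself.
        return value.strip()
    # Reverse geocoding: the query is "latitude,longitude" in decimal degrees.
    lat, lng = value
    if not (-90 <= lat <= 90 and -180 <= lng <= 180):
        raise ValueError(f"coordinates out of range: {value!r}")
    return f"{lat},{lng}"


print(to_query(" Lisboa, Portugal "))
print(to_query((38.72, -9.14)))
```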

As of today, there is some support for geocoding in QGIS, using third-party geocoding APIs. A geocoding API is a service which receives an address or a pair of coordinates as input, and returns a point or an address as a result. There are many commercial geocoding APIs on the market (including the well-known Google Maps API) and there is one free API (Nominatim) which relies on OSM data. There is no silver bullet when it comes to geocoding, and you should carefully evaluate which option best suits your use case.

The table below shows different QGIS plugins which support geocoding. Some of them are focused on geocoding, while others do a bunch of other things.

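| Plugin       | Downloads | Last Release | Forward | Reverse | API Key | Focus on geocoding | Geocoding API |
|--------------|-----------|--------------|---------|---------|---------|--------------------|---------------|
| MMQGIS       | 157418    | 2021         | y       | y       | n       | n                  | Google/OSM/…  |
| GeoCoding    | 146960    | 2018         | y       | y       | y       | y                  | OSM, Google   |
| GoogleMaps   | 52717     | 2021         | y       | n       | y       | y                  | Google        |
| Maptiler     | 15696     | 2022         | y       | n       | y       | n                  | Maptiler      |
| Nominatim LF | 9883      | 2021         | y       | y       | n       | y                  | OSM           |
| TravelTime   | 7460      | 2023         | y       | y       | y       | n                  | TravelTime    |
| TomTom       | 1450      | 2020         | y       | n       | y       | y                  | TomTom        |

Comparison between geocoding plugins in QGIS (data from 09/01/2023)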

After reviewing these plugins, it became clear that there would be space for one plugin which would address the following items:

  • Bulk processing: Although on some occasions it may be useful to geocode a single instance, this is rarely the case in GIS projects. Moreover, that functionality can be accomplished with an online tool, or even with the bulk processing itself. This line of thought renders the location filter less interesting than a bulk tool.
  • Responsive and performant: Some of the existing geocoding tools are unresponsive while handling a large number of rows. The ability to perform batch (e.g.: asynchronous) geocoding can address some of these issues.
  • Forward/reverse geocoding: Forward geocoding is disproportionately more implemented than reverse geocoding. This could be due to market demand, but also to technological reasons (e.g.: reverse geocoding is not implemented in the QGIS core). Still, if there is not too much effort, it could be nice to offer reverse geocoding to users, even if it is just for a few use cases.
  • Support for options: It would be nice to offer some of the options offered by the API, through the plugin. These could include the restriction to a country (or bounding box) and the ability to control the output fields.
  • Help/Documentation: A lot of the existing plugins have UIs which are not intuitive and do not offer any useful help/documentation. This makes using the plugins (or even finding them) very challenging. Even some resources like a tutorial or a README page on GitHub which could be referenced from the plugin, could improve this situation.
  • Intuitive UI: One of the problems with QGIS plugins is the lack of standardisation of the UI. Some plugins add icons on the toolbar, others add entries in the plugins menu or even in other menus. Some plugins add all of these things, and instead of one widget, they add multiple widgets. This renders the task of finding, setting up and using the plugin sometimes very complicated. One way of overcoming this, is to use the processing UI, which is more or less standard. Although the menu entries can be configured, the look & feel is always the same, and the plugin can always be found through the processing toolbox.
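
To make the bulk processing and responsiveness points more concrete, here is a generic sketch of geocoding many rows concurrently in plain Python. It is only an illustration of the idea, not how QGIS processing or any of the plugins above implement it; `fake_geocoder` is a stand-in for a real API call, and a real service would also impose rate limits that this sketch ignores.

```python
from concurrent.futures import ThreadPoolExecutor


def geocode_all(addresses, geocode_one, max_workers=4):
    """Geocode many addresses concurrently, keeping the input order.

    `geocode_one` is any callable taking an address and returning a result;
    failures are recorded as None so that one bad row does not stop the batch.
    """
    def safe(address):
        try:
            return geocode_one(address)
        except Exception:
            return None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields the results in the same order as the inputs.
        return list(pool.map(safe, addresses))


# Stand-in geocoder for illustration only; a real one would call an API.
def fake_geocoder(address):
    if not address:
        raise ValueError("empty address")
    return (38.72, -9.14)


print(geocode_all(["Avenida Exemplo 1", "", "Rua Exemplo 2"], fake_geocoder))
```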

The OpenCage Geocoding plugin is a processing plugin that offers forward and reverse geocoding within QGIS. Being a processing plugin, it benefits from many features out of the box, such as batch/asynchronous processing, integration with the modeller or the ability to run on the Python console. It also features a standard UI, with inputs, outputs, options and feedback which should be familiar to processing users.

This plugin relies on the OpenCage geocoding API, an API that offers geocoding worldwide based on different datasets. While OpenCage makes extensive use of Nominatim, it is worth mentioning that they contribute back to the project, both in terms of funding and of actual code.

As this is a commercial API, you will need to sign up for a key before using this plugin. You can check the different plans on their website. If you choose a trial key, you can sign up without needing a credit card, which is not always the case with other providers.

Although the plugin can be run with minimal configuration using the default options, the configuration parameters leverage the capabilities of the underlying API to generate results that best fit our use case. For instance, if you want to geocode addresses and you know that your addresses are all within a given region, you can feed the algorithm with a country name or even a bounding box. This bounding box can be hardcoded, but it can also be calculated from the layer extent, the canvas extent or even drawn by hand.
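
Outside QGIS, the same idea can be sketched in a few lines. The OpenCage API takes the bounding box as a `bounds` parameter in the order min longitude, min latitude, max longitude, max latitude; the helper below derives it from a list of points, roughly what "calculated from the layer extent" means. The function name and the coordinates are made up, and this is not the plugin's own code.

```python
def extent_to_bounds(points):
    """Format the extent of (lng, lat) points as 'min_lng,min_lat,max_lng,max_lat'."""
    lngs = [lng for lng, _ in points]
    lats = [lat for _, lat in points]
    return f"{min(lngs)},{min(lats)},{max(lngs)},{max(lats)}"


# Made-up (lng, lat) points standing in for the features of a layer.
LAYER_POINTS = [(-9.23, 38.69), (-9.09, 38.80), (-9.15, 38.74)]

# Request parameters restricting the search to Portugal and to the layer extent.
params = {
    "q": "Avenida Exemplo 1",
    "countrycode": "pt",
    "bounds": extent_to_bounds(LAYER_POINTS),
}
print(params)
```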

Apart from the formatted address and the coordinates, the algorithm can optionally return additional structured information about the location in the results. This includes, for instance, the timezone, the flag of the country and the currency (you can read here about the different annotations that the API returns). As this may slow down the response, it is switched off by default, to ensure people only request it if they are really interested in this feature.

Whether you want to geocode addresses or coordinates, you may want the resulting address to be in a specific language. If you set the language parameter, the API will make a best effort to return results in that language.
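
At the API level, these two options map to the `language` and `no_annotations` request parameters. The sketch below shows, under that assumption, how they could be set and how a few annotations could be read from a result; the helper names and the sample result are made up, and the plugin itself may organize this differently.

```python
def option_params(language=None, annotations=False):
    """Translate the two options into request parameters."""
    params = {}
    if language:
        params["language"] = language
    if not annotations:
        # Skipping annotations keeps the response smaller and faster.
        params["no_annotations"] = 1
    return params


def summarize(result):
    """Pick the address and a few annotations, tolerating missing fields."""
    annotations = result.get("annotations", {})
    return {
        "address": result.get("formatted"),
        "timezone": annotations.get("timezone", {}).get("name"),
        "currency": annotations.get("currency", {}).get("iso_code"),
        "flag": annotations.get("flag"),
    }


# Made-up result with the shape of one entry of the API's "results" list.
SAMPLE_RESULT = {
    "formatted": "Lisboa, Portugal",
    "annotations": {
        "timezone": {"name": "Europe/Lisbon"},
        "currency": {"iso_code": "EUR"},
        "flag": "🇵🇹",
    },
}

print(option_params(language="pt"))
print(summarize(SAMPLE_RESULT))
```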

I hope this plugin can be useful to users with different degrees of expertise: from the simplest use case, to the more advanced ones (through the options). Overall, the merits of this plugin are largely due to the capabilities of the processing toolbox and of the OpenCage API.

If you find any issues, please report them in the issue tracker of the project. This plugin is released under GPLv2. Feel free to fork it, look at the code and modify it for other use cases. If you feel like contributing back to the project, Pull Requests are also welcome (:

Happy geocoding!

DevRel – What is that?

Almost a year ago, I heard the term DevRel for the first time when Sara Safavi, from Planet, gave a talk at CodeOp and used that word to describe her new role. I knew Sara as a developer, like myself, so I was curious to learn what this role entailed and understand how it could attract someone with a strong technical background.

It turns out that DevRel – Developer Relations – is as close as you can be to the developer world, without actually writing code. All these things that I used to do in my spare time, like participating in hackathons, writing blog posts, participating in conversations on Twitter, speaking at events, are now the core part of my job. I did them, because they are fun, and also because I believe that ultimately, writing code has an impact in society, and in order to run that last mile we need to get out of our compilers and reach out to the world. Technology is like a piece of art – it only fulfills its mission when it leaves the artist’s basement and it reaches the museums, or at least the living room of someone who appreciates it.

I am happy to say that I am now the DevRel at the Open Geospatial Consortium. In a way, it is a bit ironic that I ended up taking this role in an organization that does not actually produce software as its main outcome. But in a way OGC is the ultimate software facilitator, by producing the standards that will be used by developers to build their interoperable, geospatially aware products and services. If you are reading this and you are not a geogeek, you may think of W3C as a somewhat similar organization: it produces the HTML specification, which is not itself software, but how could we build all these frontend applications using React, Vue and so many other frameworks, without using HTML? It is that important. Now you may be thinking, “so tell me an OGC standard that I use, or at least know”, and, again, if you are not a geogeek, maybe you won’t know any of the standards I will mention, even if you use, or have used at some point, location data. And this is part of the reason why I am at OGC.

Location data is increasingly part of the mainstream. We all carry devices in our pockets that produce georeferenced data with an accuracy that was undreamed of ten years ago. Getting hold of these data opens a world of possibilities for data scientists and data engineers, but in order for all these applications to be able to understand each other we need sound, well-articulated standards in place. My main goal as DevRel at OGC will be to bring the OGC standards closer to the developer community, by making them easier to use, and by making sure that they are actually used. And maybe, just maybe, I will also get to write some code along the way.

Interactive Maps within React.js

Recently, I have been teaching a Full-stack development bootcamp at CodeOp (great experience!).

When the students reached the project phase, I was very pleased to see a lot of interest in using maps. And that is easy to understand, right? Geospatial information is associated with most activities these days (e.g.: travel, home exchange, volunteering), and interactive maps are the backbone of any application which uses geospatial information.

This made me think of a nice way of introducing the students to interactive mapping. I realized that most of them want to do one thing: read an address and display it on the map, which also requires the use of a geocoder. In order to demonstrate how to put all these things together within a React application, which is the framework they are using, I created a small demo on GitHub. This was also an opportunity to practice and improve my front end skills! 🙂

Following a good tradition of GitHub, I started by forking an existing project, which I thought was similar to what I wanted to achieve. Although the project is extremely cool, I realized that I wanted to move in quite a different direction, so I ended up diverging a lot from the original code base.

To implement the map, I used my favourite library for interactive maps, Leaflet. This library is actually packaged as a React component, so it is really easy to incorporate it into an application.

Of course, maps only understand coordinates, and most of the time people have nominal locations such as street names, cities, or even postcodes. This was also the case with my students. Translating strings with addresses to a pair of coordinates is not a trivial task, so the best thing is to leave it up to the experts. I used the OpenCage geocoder, an API to convert coordinates to and from places. Why? It has a much more generous free tier than the Google Maps API, and it is open-source. And although it is built on top of OSM Nominatim, it contains several improvements.

The good news is that OpenCage also has a package for JavaScript and Node, and it is really easy to use. This is the piece of code that retrieves the coordinates from a given string:

    // Geocodes the typed address, adds a marker to the map and flies to it with an animation
    addLocation = () => {
      opencage
        .geocode({ q: this.state.input, key: OCD_API_KEY })
        .then(data => {
          // No match for this query: tell the user and stop
          if (data.results.length === 0) {
            alert("No results found!");
            return;
          }
          // The first result is the best match; its geometry is { lat, lng }
          const latlng = data.results[0].geometry;
          console.log("Found: " + data.results[0].formatted, latlng);
          // Build a new array instead of mutating the one held in the state
          this.setState(prevState => ({ markers: [...prevState.markers, latlng] }));
          // Leaflet map instance behind the component referenced as "map"
          const mapInst = this.refs.map.leafletElement;
          mapInst.flyTo(latlng, 12);
        })
        .catch(error => {
          console.log("error", error.message);
        });
    }

In order to do this, you need to sign up for a free API key first, and store it within a secrets file (.env).

The application allows the user to type any address, and it will fly to it with an animation, adding a marker on the map.

You can check out the final result at https://leaflet-react.herokuapp.com/

Data Analytics Bootcamp

I have always dreamed of making some contribution towards improving the gender balance in technology, which, as you may know, is far from ideal.

Fortunately, the opportunity arose when Katrina Walker invited me to teach the “Data Analytics” bootcamp at CodeOp, an international code school for women and TGNC individuals.

Over the 6-month course, I will share my hands-on experience with the various stages of the data analysis pipeline, specifically on how to apply various technologies to ingest, model and visualize data insights.

Rather than focusing on a specific technology, I will follow the “best tool for the job” approach, which is what I do when I want to analyse data. This means learning different tools, such as Python, R, SQL or QGIS, and often combining them.

For me “data analytics” is like a journey, where we start with a high-level problem, translate it into data and algorithms, and finally extract a high-level idea. At the start and the end of the journey, we should always be able to communicate with people who are not “data geeks”, and this is one idea that I would like to pass on in the course.

I will not add anything else, except that I am really excited to get started!

FindMeACoin: a Platform to Support Offline Trading of Cryptocurrencies

Trading cryptocurrency offline[1], in person, is the quickest way of acquiring/selling crypto coins. It is also the only way of not exposing any identity information.

This post presents a platform for finding buyers/sellers of cryptocurrencies in a certain geographic location. Registered users can find other users on a map, and get in touch with them to arrange a meeting.

The collected information about the users is kept at the bare minimum.

The platform only puts users in touch. It does not participate or interfere in the trading process, and thus it does not take any liability for what may happen. However, if the transaction is successful it does collect a fee, based on a smart contract.

The platform uses a gamification approach, with the use of avatars associated with reputation. It aims to be simple and user friendly.

At this stage we are looking forward to collecting expressions of interest, and feedback in general. If you want to be the first to try FindMeACoin, please register now!

FindMeACoin is a platform which enables users to trade cryptocurrencies, in a private, reliable, easy and fun way.

Let’s make transactions private again!

[1] In this context, trading currencies “offline” means trading currencies in person, converting from cash to a cryptocurrency, or vice versa.

Docker for Programmers

In some ways, docker can be seen as the holy grail of DevOps: develop locally, ship everywhere.

Although it is still a relatively recent technology, docker’s adoption curve has been so steep that it has become almost a de facto standard in the software industry for shipping software applications.

Companies such as CloudBees or Elastic, and Free and Open Source projects such as PostgreSQL or Debian, all make their applications available through the official repositories of docker hub, the largest public container repository, where you can find anything from a text parser to an operating system.

Are people really using docker in production? The answer is “yes”, and perhaps the best use case is Spotify, which is not only using it, but also contributing to its usage, by making their Java client libraries available.

As an early adopter, I consider myself an enthusiast, although I have already had some “oops” moments which made me question whether I always want to be riding the “crest of the wave” (especially in production). Overall, I think it is a fascinating technology and I would recommend that every programmer at least get to know it, and apply it even if just for the simplest use cases: quickly trying a software application without “polluting” your local environment, and testing your software in a “clean” environment which mimics the customer’s settings. A more serious use of docker could be facilitating a continuous deployment and testing pipeline, in a cloud platform.

I recently took up the challenge from Kato global to start teaching a series of docker courses, specially aimed at programmers. The first course will be an introduction, and thus it will not require any prior knowledge of docker, and subsequent courses will build on this knowledge to take students one step further. The idea is to share my first-hand knowledge of using docker in production, by doing “hands-on” courses with real-life challenges, for people working in the software industry. The first course is scheduled for September, in Lisbon.

https://www.meetup.com/KATO-Lisbon/events/252827669/

https://www.eventbrite.com/e/braingym-docker-for-programmers-2-day-course-tickets-48117883886

If you are a developer, don’t miss this opportunity to extend your DevOps skill set, and find out in which ways docker could make your life easier.

Hope to meet you in September!

Modular Architectures Made Easier with docker-compose

The Open GeoPortal is a Free and Open Source framework for rapidly discovering, previewing and retrieving curated geospatial data from multiple repositories. It implements a modular architecture, including a database, a search engine and several web applications.

It can be argued that such a system is difficult to set up and run. While collaborating with Tufts University, I had the opportunity to dockerize some of these applications and tie them together in a docker composition.

The final result? The entire framework can be launched within a couple of minutes, with one single command: docker-compose up

If you don’t believe it, check the video below! 😉

The Data Ingest API from Joana Simoes on Vimeo.

If you want to try it yourself: git clone https://github.com/OpenGeoportal/Data-Ingest.git. The docker composition lives inside the docker folder.

Have fun with docker-compose! 🙂

Women in Tech: Learn How to Code

If you ask me which sort of women are coders, I would say any.

It is a fact that, despite recent efforts, women are still underrepresented in IT. Although I think that to change this it is essential to focus on early education, it is true that a lot of women can discover the joys of programming at a later stage of their lives, and not necessarily in connection with their main activity. Programming can be, or at least start, as a hobby, or as an extension of another activity. For instance, biologists may find that they want to learn how to code in order to crunch observation data, and makers may find that they want to program their hardware devices in order to schedule a process. Whatever the reason that brings people into programming, it is important to say that it is not out of reach for any age or academic background.

Although there are no miracles, openness, curiosity and effort can pave the way to great progress. And the most important thing is that the journey itself can be fun.

In this context, a little push in the beginning can save a lot of time and effort. It goes without saying that programming is also a craft, and therefore it requires a lot of self-learning. However, getting the basic principles right from the beginning is likely to put people on the right track, on a more pleasant, fruitful, and especially quicker, path.

Starting in September, I will be teaching an introductory programming course, specially aimed at women (although everyone is, obviously, welcome). The course is designed to guide the students through the initial steps of programming, from logical operations to object oriented concepts. Although Python will be used as the main programming language, I would like to think of this as a more general programming course which will introduce the foundations to start learning and using any object oriented programming language, rather than a specific Python course.

I know from first-hand experience that it can be a bit intimidating to be the only woman in the class, and sometimes this can stop us from raising our hands and asking questions, which is an invaluable way to learn and stay motivated. In this course we commit to providing a welcoming environment for women of all ages to participate in the class and learn about programming.

If you are a woman and your range of interests intersects STEM (Science, Technology, Engineering and Maths), don’t miss this opportunity to extend your skills. Accept the challenge and embark on a fun journey, which can ultimately bring you a lot of fulfilment and joy.

https://www.eventbrite.com/e/braingym-women-in-tech-python-2-day-workshop-tickets-48063500223

Looking forward to meeting you in September!

Docker & Microservices

In this presentation I share some “lessons learned”, through ups & downs in a “journey” to implement a microservices architecture using the docker framework.

My overall feeling is that although it has been sometimes a “bumpy” road, the microservices paradigm is a good approach to complex software projects, and the docker technology has some really great features in it to support it.

Great crowd at the #DockerBcn meetup: really enjoyed the meeting! Thanks to Dimitris and Skyscanner for hosting the event.