Tuesday, 15 October 2013
A fix for users of Python 2.6 or 2.7 when installing PyMix
Sunday, 13 October 2013
How to pass arguments to python programs ...
... the right way (in my opinion). This is the basic template I always use. Not complicated, but the general formula might be useful to someone.
It allows you to pass arguments to a program with the use of flags, so they can be provided in any order. It has a 'help' flag (-h) and will raise an error if no arguments are passed.
In this example, one of the inputs (-a) is supposed to be a file. If the file is not provided, it uses Tkinter to open a simple 'get file' window dialog.
It then parses the other arguments out. Here I have assumed one argument is a string, one is a float, and one is an integer. If any arguments are not provided, it falls back to default values. It prints everything it's doing to the screen.
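The following is a minimal sketch of that template using argparse: -a and -h behave as described above, while the -s/-f/-i flag names and their defaults are just illustrative.

import argparse
import sys
try:  # Python 2
    from Tkinter import Tk
    from tkFileDialog import askopenfilename
except ImportError:  # Python 3
    from tkinter import Tk
    from tkinter.filedialog import askopenfilename

parser = argparse.ArgumentParser(description='basic template')  # -h comes for free
parser.add_argument('-a', help='input file')
parser.add_argument('-s', default='hello', help='a string')
parser.add_argument('-f', type=float, default=0.5, help='a float')
parser.add_argument('-i', type=int, default=10, help='an integer')

if len(sys.argv) == 1:  # no arguments passed: complain and stop
    parser.print_help()
    sys.exit('Error: no arguments passed')

args = parser.parse_args()

infile = args.a
if infile is None:  # no file given: open a simple 'get file' dialog instead
    Tk().withdraw()
    infile = askopenfilename()

print('file: %s' % infile)
print('string: %s, float: %f, integer: %d' % (args.s, args.f, args.i))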
Saturday, 12 October 2013
Kivy python script for capturing and displaying your webcam
Kivy is very cool. It allows you to make graphical user interfaces for computers, tablets and smart phones ... in Python. Which is great for people like me. And you, I'm sure. And it's open source, of course.
Today was my first foray into programming using Kivy. Here was my first 'app'. It's not very sophisticated, but I find Kivy's documentation rather impenetrable and frustratingly lacking in examples. Therefore I hope more people like me post their code and projects on blogs.
Right: it detects and displays your webcam on the screen, and draws a little button at the bottom. When you press that button it takes a screengrab and writes it to an image file. Voila! Maximise the window before you take a screen shot for best effect.
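In that spirit, here is a minimal sketch of such an app. It assumes your Kivy install has a working camera provider (gstreamer or opencv) and is recent enough (1.8+) to have export_to_png; the output filename is arbitrary.

from kivy.app import App
from kivy.uix.boxlayout import BoxLayout
from kivy.uix.button import Button
from kivy.uix.camera import Camera

class CamApp(App):
    def build(self):
        root = BoxLayout(orientation='vertical')
        self.cam = Camera(play=True)  # first available webcam
        btn = Button(text='Capture', size_hint_y=0.1)
        btn.bind(on_press=self.capture)
        root.add_widget(self.cam)
        root.add_widget(btn)
        return root

    def capture(self, instance):
        # write whatever the camera widget is currently showing to a png
        self.cam.export_to_png('grab.png')

if __name__ == '__main__':
    CamApp().run()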
Simple Python script to write a list of coordinates to a .geojson file
This has probably been done better elsewhere, but here's a really simple way to write a GeoJSON file programmatically to display a set of coordinates, without having to bother learning the syntax of yet another python library. GeoJSON is a cool little format: open source and really easy.
Below, arrays x and y are decimal longitude and latitude, respectively. The code would have to be modified to include a description for each coordinate, which could be done easily in each feature's properties if supplied as an iterable list.
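A minimal sketch of the idea, keeping to the same no-extra-libraries constraint (the empty 'description' property is where a per-point description would go):

def write_geojson(x, y, outfile='points.geojson'):
    # x and y are iterables of decimal longitudes and latitudes
    with open(outfile, 'w') as f:
        f.write('{ "type": "FeatureCollection",\n"features": [\n')
        features = []
        for lon, lat in zip(x, y):
            features.append('{ "type": "Feature",\n'
                            '"geometry": { "type": "Point", "coordinates": [%.6f, %.6f] },\n'
                            '"properties": { "description": "" } }' % (lon, lat))
        f.write(',\n'.join(features))
        f.write('\n] }\n')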
Wednesday, 14 August 2013
Doing Spectral Analysis on Images with GMT
Mostly an excuse to play with, and show off the capabilities of, the rather fabulous IPython notebook! I have created an IPython notebook to demonstrate how to use GMT to do spectral analysis on images.
This is demonstrated by pulling wavelengths from images of rippled sandy sediments.
GMT is command-line mapping software for Unix/Linux systems. It wasn't really designed with spectral analysis in mind, especially not on images, but it is actually a really fast and efficient way of doing it!
This also requires the convert facility provided by ImageMagick. Another bodacious tool I couldn't function without.
My notebook can be found here.
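The GMT mechanics are in the notebook; for anyone who just wants the underlying idea, here's a rough pure-numpy equivalent (assumes a roughly square image called ripples.jpg, which is my placeholder name):

import numpy as np
from PIL import Image

im = np.array(Image.open('ripples.jpg').convert('L'), dtype=float)
im -= im.mean()  # remove the DC offset
P = np.abs(np.fft.fftshift(np.fft.fft2(im))) ** 2  # 2-D power spectrum
cy, cx = np.array(P.shape) // 2
y, x = np.indices(P.shape)
r = np.hypot(x - cx, y - cy).astype(int)  # radial wavenumber bins
counts = np.bincount(r.ravel())
radial = np.bincount(r.ravel(), weights=P.ravel()) / np.maximum(counts, 1)
k = radial[1:min(cy, cx)].argmax() + 1  # dominant wavenumber (skip DC)
print('dominant wavelength ~ %.1f pixels' % (min(im.shape) / float(k)))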
Fantastic images of ripples provided by http://skeptic.smugmug.com
Labels: bash, data analysis, GMT, ipython, photography, python
Saturday, 27 July 2013
Taking the Grunt out of Geotags
Here's a matlab script to automatically compile all geo-tagged images from a directory and all its subdirectories into a Google Earth kml file showing positions and thumbnails.
Like most of my matlab scripts, it's essentially just a wrapper for a system command. It requires exiftool for the thumbnail creation (available for both *nix and Windows). It also needs the (free) Google Earth toolbox for Matlab.
Right now it's just set up for JPG/jpg files, but could easily be modified or extended. As well as the kml file (Google Earth will launch as soon as the script is finished), it will save a matlab format file containing the lats, longs, times and filenames.
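If you'd rather skip Matlab altogether, the same idea can be roughed out in Python, leaning on exiftool in the same way. This is a sketch only: the output filename is arbitrary and the placemarks are plain (no thumbnails).

import csv
import subprocess

def geotag_kml(photo_dir, outfile='photos.kml'):
    # pull numeric (-n) GPS coordinates from all jpgs, recursively (-r), as CSV
    cmd = ['exiftool', '-r', '-n', '-csv',
           '-gpslatitude', '-gpslongitude', '-ext', 'jpg', photo_dir]
    rows = csv.DictReader(subprocess.check_output(cmd).decode().splitlines())
    with open(outfile, 'w') as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        f.write('<kml xmlns="http://www.opengis.net/kml/2.2"><Document>\n')
        for row in rows:
            if row.get('GPSLatitude') and row.get('GPSLongitude'):
                f.write('<Placemark><name>%s</name><Point><coordinates>'
                        '%s,%s</coordinates></Point></Placemark>\n'
                        % (row['SourceFile'], row['GPSLongitude'], row['GPSLatitude']))
        f.write('</Document></kml>\n')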
Outputs look a little like this (a selection of over 30,000 photos taken in Grand Canyon)
Thursday, 25 July 2013
How to start a BASH script
Wednesday, 24 July 2013
Destination point given distance and bearing from start point
You have starting points:
xcoordinate, a list of x-coordinates (longitudes)
ycoordinate, a list of y-coordinates (latitudes)
num_samples, the number of samples in the plane towards the destination point
bearings:
heading, a list of headings (degrees)
and distances:
range, a list of distances in two directions (metres)
This is how you find the coordinates (x, y) along the plane defined by a starting point and two end points, in MATLAB.
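The MATLAB details aside, the underlying formula is the standard spherical 'destination point' calculation. Here's a sketch in Python for a single start point, heading and range (R is the mean Earth radius; the other names follow the list above, and would be looped over for lists of inputs):

import numpy as np

R = 6371000.0  # mean Earth radius, metres

def destination_points(lon1, lat1, heading, distance, num_samples):
    # coordinates of num_samples points spaced along the great circle
    # from (lon1, lat1) on the given heading (degrees; distance in metres)
    lat1, lon1, brng = np.radians(lat1), np.radians(lon1), np.radians(heading)
    d = np.linspace(0, distance, num_samples) / R  # angular distances
    lat2 = np.arcsin(np.sin(lat1) * np.cos(d) +
                     np.cos(lat1) * np.sin(d) * np.cos(brng))
    lon2 = lon1 + np.arctan2(np.sin(brng) * np.sin(d) * np.cos(lat1),
                             np.cos(d) - np.sin(lat1) * np.sin(lat2))
    return np.degrees(lon2), np.degrees(lat2)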
Friday, 8 March 2013
Coordinate system conversion using cs2cs from PROJ
I hate geodesy. It's confusing, multifarious, and ever-changing (and I call myself a scientist!). But we all need to know where we, and more importantly our data, are. I have need of a single tool which painlessly does all my conversions, preferably without me having to think too much about it.
You could use an online tool, but when you're scripting (and I'm ALWAYS scripting) you want something a little less manual. Enter cs2cs from the PROJ.4 initiative. It will convert pretty much anything to anything else. Perfect.
In Ubuntu:
sudo apt-get install proj-bin
In Fedora I searched yum for proj-4
If the man page is a little too much too soon to take in, this wonderful page helps you decide which flags and parameters to use, given a rudimentary understanding of what you want.
For example, if I want to convert coordinates in Arizona Central State Plane to WGS84 Latitude/Longitude, I use this:
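Something along these lines (the Arizona Central zone parameters here are reconstructed from the zone's published definition rather than anything authoritative, and +to_meter=0.3048 only belongs there if your input coordinates are in feet rather than metres):

cs2cs -f "%.6f" +proj=tmerc +lat_0=31 +lon_0=-111.9166666667 +k=0.9999 +x_0=213360 +y_0=0 +datum=NAD83 +to_meter=0.3048 +to +proj=longlat +datum=WGS84 infile > outfile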
(the State Plane Coordinate System is new to me, but my view of the whole field of geodesy is 'let's make things even more complicated, sorry, accurate!')
Let's decompose that a bit:
-f "%.6f" tells it I want decimal degrees with 6 values after the decimal point (you could leave this out if you want deg, min, sec)
+proj=tmerc says use the Transverse Mercator projection
+lat_0, +lon_0, +k, +x_0, +y_0, +datum and +to_meter are all parameters related to the input coordinates in Arizona Central State Plane (the 'to_meter' is a conversion from feet, the default unit, to metres)
+proj=longlat +datum=WGS84 are the parameters related to the output coordinates in latitude/longitude
infile is a two-column list of points, e.g.:
2.19644131000e+005 6.11961823000e+005
2.17676234764e+005 6.11243565478e+005
2.19763457634e+005 6.11234534908e+005
2.19786524782e+005 6.11923555789e+005
2.19762476867e+005 6.11378246389e+005
outfile will be created with the results:
-111.846501 36.517758 0.000000
-111.868478 36.511296 0.000000
-111.845175 36.511202 0.000000
-111.844912 36.517412 0.000000
-111.845185 36.512498 0.000000
By default it will always give you the third dimension (height) even if you didn't ask for it, as above, so I tend to trim it by asking awk to give me just the first two columns separated by a tab:
awk '{print $1 "\t" $2}' outfile
which leaves:
-111.846501 36.517758
-111.868478 36.511296
-111.845175 36.511202
-111.844912 36.517412
-111.845185 36.512498
Friday, 8 February 2013
Raspberry Pi: launch program on boot, and from desktop
The following is an example of how to launch a program running in a terminal upon boot, in Raspbian wheezy: edit the session autostart file and add an entry for your program. The same program can then be launched from the desktop upon double-click by creating a .desktop launcher for it.
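Something like the following (a sketch of the usual LXDE recipe on wheezy; the script name and path are placeholders). For launching at boot, add a line to /etc/xdg/lxsession/LXDE/autostart:

@lxterminal -e python /home/pi/myscript.py

For the desktop, create a file such as ~/Desktop/myscript.desktop containing:

[Desktop Entry]
Name=myscript
Type=Application
Exec=lxterminal -e "python /home/pi/myscript.py"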
Monday, 4 February 2013
Compiling points2grid on Fedora with g++
Gridding large [x,y,z] point cloud datasets. Matlab and python have both failed me with their inefficient use of memory. So I'm having to turn to other tools.
Points2Grid is a tool, written in C++, which promises to do the job in a simple intuitive way. It accepts ascii and las (lidar) formats and outputs gridded data, plus gridded data densities.
First, install libLAS, the lidar (LAS) data translation library. The recommended version is 1.2.1; I installed liblas 1.6.1 because it was the oldest version on the site that had a working link, and built it from source in the usual way.
Make a note of where the libraries are installed. On my system: /usr/local/include/liblas/
Download points2grid from here.
In Interpolation.h, adjust the MEM_LIMIT variable to reflect the memory limitations of your system. From the README guide: "As a rule of thumb, set the MEM_LIMIT to be (available memory in bytes)/55. For jobs that generate grid cells greater than the MEM_LIMIT, points2grid goes 'out-of-core', resulting in significantly worse performance."
Find the available memory in bytes using, for example, free -b.
I then had to point the code to where the liblas directories had been installed, by editing the liblas include paths in Interpolation.cpp to match (in my case, /usr/local/include/liblas/).
It should then install with a simple 'make' command. Happy gridding!
Saturday, 26 January 2013
Automatic Clustering of Geo-tagged Images. Part 2: Other Feature Metrics
In the last post I took a look at a simple method for trying to automatically cluster a set of images (of Horseshoe Bend, Glen Canyon, Arizona). Some of those pictures were of the river (from different perspectives) and some of other things nearby.
With the goal of separating the two classes, the results were reasonably satisfactory using image histograms as a feature metric, plus a little user input. However, it'd be nice to have a more automated/objective way to separate the images into two discrete clusters. Perhaps the feature detection method requires a little more scrutiny?
In this post I look at 5 methods, using the functionality of the excellent Mahotas toolbox for python, which can be downloaded here:
1) Image moments (I compute the first 5)
2) Haralick texture descriptors, which are based on the co-occurrence matrix of the image
3) Zernike Moments
4) Threshold adjacency statistics (TAS)
5) Parameter-free threshold adjacency statistics (PFTAS)
The code is identical to the last post except for the feature extraction loop. What I'm looking for is as many images of the river as possible clustered together, either at low distances (extreme left in the dendrogram) or at very high distances (extreme right). These clusters have been highlighted below. There are 18 images of the river bend.
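For reference, the feature-extraction part might look something like this (a sketch only: mahotas.imread needs the separate imread package, so substitute your preferred image reader, and the moment orders are just an example):

import numpy as np
import mahotas
from mahotas.features import haralick, zernike_moments, tas, pftas

def extract_features(imfile, method='pftas'):
    im = mahotas.imread(imfile, as_grey=True).astype(np.uint8)
    if method == 'moments':
        # a handful of low-order image moments
        return np.array([mahotas.moments(im, p, q)
                         for p in range(2) for q in range(3)])
    if method == 'haralick':
        # average the 13 descriptors over the 4 co-occurrence directions
        return haralick(im).mean(axis=0)
    if method == 'zernike':
        return zernike_moments(im, radius=50, degree=8)
    if method == 'tas':
        return tas(im)
    return pftas(im)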
In order of success (low to high):
1) Zernike (using a radius of 50 pixels and degree 8, though other combinations were tried):
This does a poor job, with no clusters at either end of the dendrogram.
2) Haralick:
There is 1 cluster of 4 images at the start
3) Moments:
There is 1 cluster of 6 images at the start, and 1 cluster of 2 at the end
4) TAS:
There is 1 cluster of 7 images at the start, and 1 cluster of 5 near the end, and both clusters are on the same stem
5) PFTAS:
The cluster at the end contains 16 out of the 18 images of the river bend. Plus there are no user-defined parameters. My kind of algorithm. The winner by far!
So there you have it, progress so far! No method tested so far is perfect. There may or may not be a 'part 3', depending on my progress, or lack thereof!
Labels: clustering, data analysis, image analysis, python
Tuesday, 22 January 2013
Automatic Clustering of Geo-tagged Images. Part 1: using multi-dimensional histograms
There are a lot of geo-tagged images on the web. Sometimes the image coordinate is slightly wrong, or the scene isn't quite what you expect. It's therefore useful to have a way to automatically download images from a given place (specified by a coordinate) on the web, and automatically classify them according to content/scene.
In this post I describe a pythonic way to:
1) automatically download images based on input coordinate (lat, long)
2) extract a set features from each image
3) classify each image into groups
4) display the results as a dendrogram
In this first example, step 2) is achieved by very simple means, namely the histogram of image values. This doesn't take into account any texture similarities or connected components, etc. Nonetheless, it does a reasonably good job of classifying the images into a number of connected groups, as we shall see.
In subsequent posts I'm going to play with different ways to cluster images based on similarity, so watch this space!
User inputs: give the lat and long of a location, a path where you want the geo-tagged photos, and the number of images to download.
First, import the libraries you'll need. The following then interrogates the website and downloads the images.
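A sketch of the download step, assuming the Panoramio API (which has since been shut down, so treat this as illustrative only); it grabs up to n photos from a small bounding box around the coordinate:

import json
import os
import urllib2  # urllib.request in Python 3

def get_photos(lat, lon, outdir, n, delta=0.01):
    # bounding box of width 2*delta degrees centred on (lat, lon)
    url = ('http://www.panoramio.com/map/get_panoramas.php?set=public'
           '&from=0&to=%d&minx=%f&miny=%f&maxx=%f&maxy=%f&size=medium'
           % (n, lon - delta, lat - delta, lon + delta, lat + delta))
    photos = json.loads(urllib2.urlopen(url).read())['photos']
    for i, p in enumerate(photos):
        with open(os.path.join(outdir, '%i.jpg' % i), 'wb') as f:
            f.write(urllib2.urlopen(p['photo_file_url']).read())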
As you can see, the resulting images are a mixed bag. There are images of the river bend, the road, the desert, Lake Powell and other random stuff. My task is to automatically classify these images so it's easier to pull out just the images of the river bend.
The following compiles a list of these images. The last part sorts the list, which is not strictly necessary, but doing so converts the list into a numpy array, which is. Clustering of the images is then achieved using Jan Erik Solem's rather wonderful book 'Programming Computer Vision with Python', the examples from which can be downloaded here. Download this one; then this bit of code does the clustering:
The approach taken is hierarchical clustering using a simple euclidean distance function. This bit of code does the dendrogram plot of the images:
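If you don't have the book code to hand, scipy can stand in for both steps. A sketch (imlist is the sorted array of filenames from above; the 8-bins-per-channel joint colour histogram is my choice of feature):

import os
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from scipy.cluster.hierarchy import linkage, dendrogram

features = []
for fname in imlist:
    im = np.array(Image.open(fname).convert('RGB').resize((64, 64)))
    # normalised joint colour histogram as the feature vector
    h, _ = np.histogramdd(im.reshape(-1, 3), bins=8, range=[(0, 255)] * 3)
    features.append(h.ravel() / h.sum())

Z = linkage(np.array(features), method='single', metric='euclidean')
dendrogram(Z, labels=[os.path.basename(f) for f in imlist])
plt.savefig('dendrogram.png')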
which in my case looks like this (click on the image to see the full extent):
It's a little small, but you can just about see (if you scroll to the right) that it does a good job of clustering images of the river bend which look similar. The single-clustered, or really un-clustered, images to the left are those of the rim, road, side walls, etc., which don't look anything like the river bend.
Next, reorder the image list in terms of similarity distance (which increases in both the x and y directions of the dendrogram above).
Which gives me:
As you can see, they are all images of the river bend, except the 2nd from the left on the top row, which is a picture of a shrub. Interestingly, the pattern of a circular shrub surrounded by a strip of sand is visually similar to the horseshoe bend!
However, we don't want to include it with images of the river, which is why a more sophisticated method than the image histogram is required to classify and group similar images ... the subject of a later post.
Labels: clustering, data analysis, image analysis, python
Sunday, 20 January 2013
Alpha Shapes in Python
Alpha shapes include convex and concave hulls. Convex hull algorithms are ten a penny, so what we're really interested in here is the concave hull of an irregularly shaped or otherwise non-convex 2d point cloud, which by all accounts is more difficult.
The function is here:
The above is essentially the same wrapper as posted here, except with a different way of reading in the data and the option to specify a probe radius. It uses Ken Clarkson's C code; instructions on how to compile it are here.
An example implementation:
The above data generation is translated from the example in a matlab alpha shape function. Then:
Which produces (points are blue dots, alpha shape is red line):
and on a larger data set (with radius=1, this is essentially the convex hull):
and a true concave hull:
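If compiling C code isn't your thing, the same idea can be sketched in pure Python with scipy's Delaunay triangulation (this is my stand-in, not Clarkson's algorithm): keep the triangles whose circumradius is smaller than the probe radius, and the edges belonging to exactly one kept triangle trace out the alpha shape.

import numpy as np
from scipy.spatial import Delaunay

def alpha_shape_edges(points, radius=1.0):
    # points is an (N, 2) array; returns index pairs of boundary edges
    tri = Delaunay(points)
    edge_count = {}
    for ia, ib, ic in tri.simplices:
        pa, pb, pc = points[ia], points[ib], points[ic]
        a = np.linalg.norm(pb - pc)
        b = np.linalg.norm(pa - pc)
        c = np.linalg.norm(pa - pb)
        s = (a + b + c) / 2.0
        area = max(s * (s - a) * (s - b) * (s - c), 1e-12) ** 0.5  # Heron's formula
        if a * b * c / (4.0 * area) < radius:  # circumradius test
            for e in ((ia, ib), (ib, ic), (ia, ic)):
                e = tuple(sorted(e))
                edge_count[e] = edge_count.get(e, 0) + 1
    return [e for e, n in edge_count.items() if n == 1]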
Labels: alpha shapes, data analysis, hull, python, voronoi tessellation
Patchwork quilt in Python
Here's a fun python function to make a plot of [x,y] coordinate data as a patchwork quilt with user-defined 'complexity'. It initially segments the data into 'numsegs' clusters using a k-means algorithm. It then takes each segment and creates 'numclass' sub-clusters based on the euclidean distance from the centroid of the cluster. Finally, it plots each sub-cluster in a different colour and prints the result to a png file.
inputs:
'data' is a Nx2 numpy array of [x,y] points
'numsegs' and 'numclass' are integer scalars. The greater these numbers the greater the 'complexity' of the output and the longer the processing time
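A sketch of how such a function might look (my reconstruction; in particular, forming the sub-clusters by binning the distances from the centroid is one interpretation of the description above):

import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.vq import kmeans2

def patchwork(data, numsegs, numclass, outfile='quilt.png'):
    # k-means the points into numsegs segments
    centroids, labels = kmeans2(data.astype(float), numsegs, minit='points')
    plt.figure()
    for k in range(numsegs):
        seg = data[labels == k]
        if len(seg) == 0:
            continue
        # sub-cluster each segment by distance from its centroid
        d = np.hypot(seg[:, 0] - centroids[k, 0], seg[:, 1] - centroids[k, 1])
        edges = np.linspace(0, d.max() + 1e-9, numclass + 1)
        for c in range(1, numclass + 1):
            sub = seg[np.digitize(d, edges) == c]
            if len(sub):
                plt.plot(sub[:, 0], sub[:, 1], '.', ms=2, color=np.random.rand(3))
    plt.axis('off')
    plt.savefig(outfile, dpi=300)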
Some example outputs in increasing complexity:
numsegs=10, numclass=5
numsegs=15, numclass=15
numsegs=20, numclass=50
Wednesday, 16 January 2013
Compiling Clarkson's Hull on Fedora
A note on compiling Ken Clarkson's hull program, for efficiently computing convex and concave hulls (alpha shapes) of point clouds, on newer Fedora.
First, follow the fix described here, which fixes the 'pasting' errors given by the gcc compiler.
Then, go into hullmain.c and amend the offending line at the end of the function declarations (before the first while loop).
Finally, rewrite the makefile so it reads as below. It differs significantly from the one provided.
Notice how the CFLAGS have been removed. Compile as root (sudo make).