What I learned at PyConUK 2017

Phew! What a weekend! I’m on the train, on the way back from PyConUK 2017 and it’s been such an experience. There’s been so much to take in, I’m just going to have a bit of a brain dump here and come back to it over the next few months. Hopefully you will find some of it useful too. Feel free to comment if I’ve missed something out, or got something wrong, or you have a link that I may have missed!

 

Education, Education, Education

There was a strong education track running throughout the conference, with lots of talks, workshops, and kids trying stuff out on Raspberry Pis and micro:bits. I hadn’t come across micro:bits before and they look awesome. Got to get one. There is plenty for developers to get involved with too.

Also awesome is 13yo Josh, who has developed EduBlocks – an application to help students transition from block-based programming systems like Scratch to Python. It’s on GitHub, so all contributions are welcome.

Kushal Das demonstrated you can impress women by talking about Python. And Anwesha went on to bring PyLadies to India.

You can learn for free with the Open University – Learning to Code for Data Analysis is available on OpenLearn 24/7, whenever you are ready to learn. There is also a FutureLearn version, which only runs at scheduled times, but has full support and discussions. I may have got those the wrong way round, but you get the idea.

 

Data, Data, Everywhere, and Not a Drop to Drink

PyData also had a track of talks covering the challenges of working with and analysing data. I’ve included a few other talks here that probably aren’t in the official data track, but still cover similar ground.

I actually saw the Natural language with word vectors talk by mistake – I was intending to be somewhere else – but it turned out to be one of the most interesting talks of the weekend. Slides here.

Pandas is one of those libraries that everyone in the data community uses, but I’ve always struggled to get to grips with. Alexander Hendorf‘s talk on pandas indexes was great though, with lots of simple examples. There is a static version of the notebook too.

Tom Augspurger is handy for further pandas resources.

David Seddon’s talk on Database concurrency with Django was so packed, I was sitting on the floor and couldn’t see the slides. He was so clear and enthusiastic though, I could follow it easily. And now I know about database concurrency.

I also now know that Bokeh rhymes with Okay (and just this second learned the word bokeh comes from the Japanese word for the blurred region of a photo with a narrow depth of field). It’s one of those words I’ve seen written down but never heard anyone say. Anyway, it is awesome, both for data visualisation and as a backend for your web application. I also found out about Argo floats, which are pretty awesome.

Something close to my day job in natural hazard modelling was this talk on earthquake analysis. Lots of good tools are available, as seen in the slides [pdf], especially Earthquakes, Quake Feeds, Matplotlib Basemap, and obspy.

 

Other Cool Things

I went to a workshop on git low-level commands; these are known as plumbing commands (not to be confused with the high-level “porcelain” commands we already know – pull, diff, add, etc.). Some examples:

git hash-object -w <path>     [add a file to the object store]
git update-index --add <path> [add a file to the index]
git write-tree                [write the current index out as a tree object]

The first two commands together do roughly what the porcelain git add command does; write-tree is the first step of what git commit does. And so on.
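
To get a feel for what git hash-object is doing, here is a minimal sketch in Python of how a blob's ID is computed. It assumes a default SHA-1 repository, and git_blob_hash is just a name I made up for illustration. Unlike the -w flag, it only computes the ID and doesn't write anything to the object store.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 object ID git would give a blob with this content."""
    # Git hashes a short header ("blob <size in bytes>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # The empty blob has a well-known ID in SHA-1 repositories.
    print(git_blob_hash(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

The result should match git hash-object <path> for a file with the same bytes.
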
During the discussion, someone mentioned this blog post, which sums up git perfectly. If it was consistent, it would be so boring.

Another great talk was by the very enthusiastic Simon Davy, all about the WSGI app platform Talisker. It wraps your existing tools, providing setup, logging, standardised endpoints, etc. In particular, he talked about the benefits of logging to stderr, for example:

  • works in dev
  • handles multiple processes/threads
  • is agnostic about paths/permissions
  • OS does persistence and rotation
  • avoids stdout buffering
  • (though you will probably still want on-disk logs as well)
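
As a rough illustration of logging to stderr, here is a sketch using only the standard library logging module. This is not Talisker's API; make_stderr_logger and its stream parameter are names I've invented so the output can be redirected when experimenting.

```python
import logging
import sys


def make_stderr_logger(name, stream=None):
    """Build a logger that writes one formatted line per record to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # StreamHandler writes to sys.stderr by default; it is passed explicitly
    # here so that a different stream can be swapped in for experiments.
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger


if __name__ == "__main__":
    log = make_stderr_logger("demo")
    log.info("service started")
```

Calling it twice with the same name would attach two handlers, so in real code you would set this up once at startup.
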

And a word of warning: if your error-reporting tool says you have no errors then it is broken.

James Campbell told us all about trading cryptocurrencies with Python, using backtrader and ccxt. *THIS IS NOT FINANCIAL ADVICE* Apparently it’s a good time to get into the market as it is young, fast-moving, and so far has shown some good returns. *DEFINITELY NOT FINANCIAL ADVICE*

There was some good advice from Mark Smith on refactoring your code. Keep interface changes to a minimum, as every time you release a breaking change, it’s an opportunity for your users to switch to another library. Also, it’s a good idea to always use semantic versioning.

One useful tip from a talk on logging: instead of log.info("Hello {}".format(name)), you should use log.info("Hello %s", name), so the string is only formatted if the message is actually going to be logged. I’ve always used the first version, so will try and use the second from now on.
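
Here is a small sketch of the difference between the two, using the standard logging module. The Expensive class is a made-up stand-in that counts how often it gets turned into a string, and the logger name is arbitrary.

```python
import logging


class Expensive:
    """Hypothetical object that counts how often it is converted to a string."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return "world"


log = logging.getLogger("lazy_demo")
log.addHandler(logging.NullHandler())
log.setLevel(logging.WARNING)  # INFO messages will be filtered out

name = Expensive()

# Eager: the string is built before log.info is even called.
log.info("Hello {}".format(name))
eager_calls = name.str_calls

# Lazy: logging checks the level first and never formats the message.
log.info("Hello %s", name)
lazy_calls = name.str_calls - eager_calls

print(eager_calls, lazy_calls)
```

With the level set to WARNING, only the first call pays for the string conversion, even though neither message is emitted.
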

 

Quick Roundup of Useful Tools

Hypothesis can find edge cases for your tests.
OR-Tools is used by geotechnical engineers alongside SciPy.
Bluedot for wireless communication with Raspberry Pi, etc. See also lightning talk slides.
mypy checks type annotations statically to help catch errors before the code runs.

 

The Ones That Got Away

With so many concurrent sessions, I didn’t get to see everything I wanted, and by the sounds of it, these talks were also really good. I’ll add links to the slides if I manage to get them.

Everyday Security Issues And How To Avoid Them. Abstract.

How Close Can I Get Amazon’s Alexa To Black Mirror’s Cookie. Abstract. Slides.

Lazy sequences working hard. Abstract. Slides.

Add Guis To Your Data Pipelines With Jupyter Widgets. Abstract.
No slides for this, as apparently the whole session was conducted in a Jupyter notebook. Good work!

Finding Bugs For Free: The Magic Of Code Analysis. Abstract.
No slides yet, but a handy tool can be found at https://lgtm.com/

 

PyCon has a YouTube Channel

I haven’t looked yet, but it’s probably all on there somewhere.
https://www.youtube.com/channel/UChA9XP_feY1-1oSy2L7acog/videos

 

Quick Plug for the Python Software Foundation

“The mission of the Python Software Foundation is to promote, protect, and advance the Python programming language, and to support and facilitate the growth of a diverse and international community of Python programmers.”
python.org/psf

 

And lastly, the main thing I’ll be taking home from this weekend is that when giving a talk, you must INCREASE YOUR FONT SIZE!

Complicated BASIC schemes

Nowadays, most of us have used Google Maps or OSM to get directions to somewhere or plan a route. With the right datasets, you can perform more detailed analysis, such as this recent Esri blog post showing how to calculate the slope of a hiking trail. It wasn’t always this easy though.

I recently picked up a copy of that 1987 classic, The Ordnance Survey Outdoor Handbook (Macmillan London Ltd, ISBN 0-333-42505-7). It’s actually quite a useful and detailed book, covering map scales, grid references, the country code, weather, first aid, and so on. It has geological, botanical, and sociological histories of Britain, and there are sections on identifying plants, animals, and landscapes. There is also a list of radio station frequencies (both MW and VHF), and phone numbers for weather forecasts (with 01 codes for the London numbers – remember them?).

The navigation chapter was particularly interesting – after discussing identifying landmarks, taking bearings, and navigating without a compass, the author moved on to determining the difficulty of a walk.

Apparently, many guidebooks show a difficulty rating, but unless you know the scheme being used, it may not be much use.

“There is an international standard scheme for grading the difficulty of climbs and mountain walks, and efforts are being made to devise and agree a similar standard system for all walkers that will apply throughout Europe. Until such a system appears, you will have to improvise as best you can, and the scheme offered here may provide a basis for you.”

A scheme to calculate the difficulty of a walk

To be honest, I’m struggling a bit with this scheme. It’s too complicated and the scoring system is fairly arbitrary. I can’t imagine anyone calculating the percentage of their route that covers metalled roads, open ground, muddy ground, large boulders, etc. I’ve always used Naismith’s Rule, which is much more straightforward, and gives a pretty good estimate of the time needed to complete the route.
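
For comparison, here is Naismith’s Rule as a quick Python sketch. I’m assuming the commonly quoted metric form (one hour for every 5 km walked, plus one hour for every 600 m of ascent), and naismith_hours is just my own name for it.

```python
def naismith_hours(distance_km, ascent_m):
    """Estimate walking time in hours using the metric form of Naismith's Rule."""
    # One hour per 5 km of distance, plus one hour per 600 m climbed.
    return distance_km / 5.0 + ascent_m / 600.0


if __name__ == "__main__":
    # A 10 km walk with 600 m of ascent: 2 hours walking + 1 hour climbing.
    print(naismith_hours(10, 600))  # 3.0
```

It takes no account of terrain, rests, or descent, which is exactly why it is so much easier to apply.
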

As the text states, there is a BASIC program to help you calculate it (though it doesn’t help you calculate the percentages, and only allows you to enter up to four, rounded up to the nearest 25%). Nothing dates a publication more than using the latest technology, and this listing really makes the book feel like it’s from another era.

Calculating the difficulty of a walk

In the interests of bringing things bang up-to-date (and firmly placing them in the early 2010s), here’s a Python version, so we can all have a go. Now, is that muddy ground or stony ground?

print("CALCULATING THE DIFFICULTY OF A WALK")
print("How long is the walk? Is it:")
print("less than 6km? Type 1;")
print("6-10km? Type 2;")
print("11-15km? Type 3;")
print("16-25km? Type 4;")
print("more than 25km? Type 5.")
a = int(input())  # distance band, 1-5

print("\n\nWhat is the terrain like? Select from the list")
print("below. After each selection, type on the next")
print("line .25, .5, .75, or 1, to show the proportion")
print("of the walk accounted for by each type. You may")
print("choose up to 4. Type a zero (0) and on the")
print("following line a 1 for any choices you do not")
print("use. Is the terrain:")
print("metalled road? Type 1;")
print("well-made path? Type 2;")
print("firm beach? Type 2;")
print("open ground? Type 3;")
print("muddy ground? Type 4;")
print("stony ground? Type 4;")
print("loose sand? Type 4;")
print("large boulders? Type 5;")
print("heather or tussocky ground with no path? Type 5;")
print("ice or snow? Type 6.")
# Four (terrain score, proportion) pairs; unused slots are entered as 0 and 1.
b = int(input())
c = float(input())
d = int(input())
e = float(input())
f = int(input())
g = float(input())
h = int(input())
i = float(input())

print("\n\nHow much climbing and descending will you do?")
print("less than 50m? Type 1;")
print("51-100m? Type 2;")
print("101-300m? Type 3;")
print("301-500m? Type 4;")
print("501-700m? Type 5;")
print("701-1000m? Type 6;")
print("more than 1000m? Type 7.")
j = int(input())  # climb band, 1-7

# Terrain score weighted by the proportion of the walk it covers.
k = (b * c) + (d * e) + (f * g) + (h * i)
l = a * k * j
a = 6 - a
m = a * j
n = (l + m) / 5

print("n=" + str(n))
print("\n\n")
if n < 6:
    print("The walk will be easy")
elif n < 11:
    print("The walk will be moderate")
elif n < 16:
    print("The walk will be fairly strenuous")
elif n < 21:
    print("The walk will be strenuous")
elif n < 26:
    print("The walk will be very strenuous")
else:
    print("The walk will be very strenuous and difficult")
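
If you don’t fancy typing in eleven numbers every time, the same arithmetic can be wrapped in a function. This is only a sketch of the listing above: the names walk_difficulty and describe are mine, and the terrain is passed as a list of (score, proportion) pairs rather than entered line by line.

```python
def walk_difficulty(distance_band, terrain, climb_band):
    """Score a walk from its distance band (1-5), terrain, and climb band (1-7).

    terrain is a list of (score, proportion) pairs, e.g. [(3, 0.5), (2, 0.5)].
    """
    k = sum(score * proportion for score, proportion in terrain)
    l = distance_band * k * climb_band
    m = (6 - distance_band) * climb_band
    return (l + m) / 5


def describe(n):
    """Turn a difficulty score into the wording used by the listing."""
    bands = [
        (6, "easy"),
        (11, "moderate"),
        (16, "fairly strenuous"),
        (21, "strenuous"),
        (26, "very strenuous"),
    ]
    for upper_limit, label in bands:
        if n < upper_limit:
            return label
    return "very strenuous and difficult"


if __name__ == "__main__":
    # 11-15km (band 3), half open ground and half well-made path,
    # with 301-500m of climbing (band 4).
    score = walk_difficulty(3, [(3, 0.5), (2, 0.5)], 4)
    print(score, describe(score))
```

For that example, k is 2.5, so the score is (3 × 2.5 × 4 + 3 × 4) / 5 = 8.4, which counts as moderate.
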