Part 5 of my "ex-libris" of a Data Scientist is now available. This one is about visualization.
Starting from a historical perspective, particularly of statistical visualization, and covering a few classic must-have books, the article then goes on to cover graphic design, cartography, information architecture and design, and concludes with many recent books on information visualization (specific Python and R books to create these were listed in part IV of this series). In all, about 66 books on the subject.
Just follow the link to the LinkedIn post to go directly to it:
Recent readings (can you guess/decipher some of them?)
I've been fairly quiet on this particular blog this year. Besides a lot of data science work, I've done presentations at meetups and conferences, including a recent tutorial on "Getting to know your data at scale" at the IEEE SouthEastCon 2017. Notebooks will be posted on github soon.
But, in the meantime...
Ex-libris
Something else I've been doing is publishing a few articles here and there. Just recently, I started a 6 part series on LinkedIn called "ex-libris" of a Data Scientist. I think many readers of this blog will appreciate this series, and particularly this first installment on "data and databases":
It covers a good variety of books on the subject, some of them pretty much must-reads for whatever corner of the computer science world you live in. Also of interest will be the Postgres, Hadoop and graph database pointers and a list of over 20 curated must-read papers in the field.
Don Jennings and I presented a tutorial at PyData Carolinas 2016: Datascience on the web.
The plan was as follows:
Description
Learn to deploy your research as a web application. You have been using Jupyter and Python to do some interesting research, build models, visualize results. In this tutorial, you’ll learn how to easily go from a notebook to a Flask web application which you can share.
Abstract
Jupyter is a great notebook environment for Python based data science and exploratory data analysis. You can share the notebooks via a github repository, as html or even on the web using something like JupyterHub. How can we turn the work we have done in the notebook into a real web application?
In this tutorial, you will learn how to structure your notebook for web deployment, create a skeleton Flask application, add a model and add a visualization. We assume you have some experience with Jupyter and Python, but no previous web application experience.
Bring your laptop and you will be able to do all of these hands-on things:
get to the virtual environment
review the Jupyter notebook
refactor for reuse
create a basic Flask application
bring in the model
add the visualization
profit!
Now that it has been presented, the artifacts are a github repo and a youtube video.
The unrefactored notebook is here while the refactored one is here.
Once you run through the whole refactored notebook, you will have train and test sets saved in data/ and a trained model in trained_models/. To make these available in the tutorial directory, you will have to run the publish.sh script. On a Unix-like environment (Mac, Linux, etc.):
I'll be presenting on the subject of Professional Audio-Video Production on Linux, next week at TriLug.
From concept to finished product, it has never been easier to obtain professional results when it comes to audio-video production on Linux.
We will cover some of the hardware that should be part of your production suite, from microphones to jog wheels and highlight some of the top tools for animation, audio, broadcasting, effects, modeling, music, transcoding and video. We will also go beyond the usual suspects and introduce some tools that might not be typically used for AV production.
By the end of the presentation, you will have all the tools you need to improve the quality of your communications, for your personal enjoyment, your career, or your business.
The above looks suspiciously like a printout from my first session with Apple Logo (the language, not the branding), before I figured out the command for "pen up"...
A few months back, I was reading a few books and found the above in one of them. It is titled "Logic of Chance", by John Venn (mostly known for the Venn diagram). The year? 1866.
So, where were we? Ah yes...
3/14/16
Yes, that famous sequence of numbers. What was the story with John Venn and pi, here? Whereas I used digits 0-9 in "the 10 colors of pi", John used digits 0-7, discarding all 8s and 9s. Since back then there were no computers, he picked his numbers from a book (by William Shanks) which had 707 digits of pi, leaving him with 568 digits between 0 and 7. He mapped 0 to 7 to directions (10 directions might have felt a bit odd, at 36 degrees, versus nice 45 degree lines):
Although he doesn't specify the mapping, it is easy to infer from the graph. The first digit after the decimal is 1, then 4 and we can see the path as NE, then S, so:
0 -> N
1 -> NE
2 -> E
3 -> SE
4 -> S
5 -> SW
6 -> W
7 -> NW
The random walk
He would then move by 1 unit in the direction of each digit / direction mapping. NE, S, NE, SW, skip 9, E, so on and so forth. (NB: This is easy to reproduce in python with the turtle module. A quick search of my blog will get you started on this, from a pi generator to import turtle.)
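The walk described above can be sketched without any graphics at all, by simply computing the points visited. This is only a minimal illustration, not Venn's or my original code: the names PI_DECIMALS, STEPS and venn_walk are made up for the example, only the first 50 decimals of pi are hard-coded, and a diagonal move is assumed to be one grid step in both x and y.

```python
# First 50 decimal digits of pi (after the "3."), hard-coded for illustration.
PI_DECIMALS = "14159265358979323846264338327950288419716939937510"

# Digit -> (dx, dy) step on a grid, following the mapping inferred from the graph.
# Assumption: a diagonal direction moves one grid cell in both x and y.
STEPS = {
    "0": (0, 1),    # N
    "1": (1, 1),    # NE
    "2": (1, 0),    # E
    "3": (1, -1),   # SE
    "4": (0, -1),   # S
    "5": (-1, -1),  # SW
    "6": (-1, 0),   # W
    "7": (-1, 1),   # NW
}


def venn_walk(digits):
    """Return the list of (x, y) points visited, starting at the origin."""
    x, y = 0, 0
    path = [(x, y)]
    for digit in digits:
        if digit not in STEPS:  # 8s and 9s are discarded
            continue
        dx, dy = STEPS[digit]
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


path = venn_walk(PI_DECIMALS)
print(path[:6])
```

Feeding each consecutive pair of points to a plotting library, or to turtle.goto(), would then draw the figure.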
His conclusion stated:
"The result seems to me to furnish a very fair graphical indication of randomness".
About 3 years ago, I wrote a piece titled "Going in the wrong direction" (well worth your time, go ahead and read it). In it, I highlighted the issue of the high cost of computing for experimentation and innovation, particularly when it comes to students. This obviously has an impact on STEAM and school budgets too. I suggested that we'd see $20 and even sub $20 computers very soon.
I had to revisit the original story at the beginning of 2015, because the price of each iteration of the Raspberry Pi entry level model kept going down. It looked like sub $20 was close, at least as I was picturing it in my mind. At the same time, the higher spec model kept getting better (see my article on 3D Future Tech as to why that is possible).
I'll CHIP in $9
Earlier this year, a kickstarter campaign introduced the CHIP, a $9 computer. According to http://getchip.com they will sell it this coming Monday for $8!
How low can you go?
Meet the $5 #pizero
The Raspberry Pi foundation is now selling a $5 version of the Raspberry Pi. It is half the size of the Model A+ and a quarter of the price...
And yet another price model that totally disrupts the field. Just look at that:
So now, we've reached the price level where distribution and shipping costs matter more than the cost of the computer itself. This is the next problem to solve in bridging the digital divide.
Another exciting Winston-Salem Section meeting at CDI! Wednesday, August 12, 2015 at 11:30am.
Presenter: Francois Dion
Originally from Montreal, Canada, Francois Dion is a Coder, Data Scientist, Entrepreneur, Hacker, Mentor, Musician, Polyglot, Photographer, Polymath and Sound Engineer. He is the founder of Dion Research LLC, an R&D firm specializing in Fully Integrated Software and Hardware (www.dionresearch.com) and works as a Data Scientist at Inmar, Inc. (www.inmar.com).
He is also the founder of the local Python user group (PYPTUG), a group he founded to “promote and advance computing, electronics and science in general in North Carolina using the Python programming language.”
Detail:
Behind the scenes: various aspects of electronics, computing clusters and data science in near space. A glimpse at future technology, and at the future of technology.
When
Date: 12-August-2015
Time: 11:30AM to 01:30PM (2.00 hours)
All times are: America/New_York
For the past several years, South East Linux Fest has been one of the conferences in the southeast US that I've looked forward to. Not just for your day to day Linux admin stuff, but for a wide gamut of talks on databases, operating system design, security, programming, history, so on and so forth. In 2013, we started to see some Raspberry Pi related talks (see Alcyone And On And On - hope you get the musical reference), and in 2014 I was able to present about Python 3 in the browser (thanks to Brython). And here comes SELF 2015. I was pleasantly surprised to find out that my proposal for 2015 was not only accepted, but selected as the Saturday keynote, at 9am.
SELF Keynote
Title: "Team Near Space Circus: Computing at 80,000 ft" By Francois Dion
Description:
On April 21st 2015, Team Near Space Circus made history, sending a high altitude balloon (NSC-01) into near space. Aboard the vehicle, a computing cluster (a first for HAB flights) of 7 networked Raspberry Pi, 8 cameras and multiple sensors recorded 250GB of data, images and video.
This is the behind-the-scenes story of this groundbreaking flight.
When:
Saturday June 13th 2015, 9 AM.
P.S. This is a talk of general interest and will appeal to those who enjoy a good story.
When & Where
SouthEast|LinuxFest
June 12-14, 2015
Sheraton Charlotte Airport
Charlotte, NC
In the coming days and weeks we'll publish charts and graphs, and we'll be working on images and videos. For right now though, I don't really feel like typing much, so I'll leave you with a few pictures... (Captured using a webcam and pygame and an infrared Raspberry Pi camera, all from node #7 of the compute cluster)
Infrared image, lakes are easy to see on those (and our radar reflector)
I'll be presenting at South East Linux Fest (SELF) 2014. According to the final schedule, it is tomorrow, Friday the 20th at 4PM. Right after that will be a BOF in that same room. Come by and talk *nix and Python.
Last year Python was pretty much absent from the conference. Yet it is quite focused on security, and Python is just the right tool for that kind of stuff.
Brousse
That's one project among many I'll be talking about, bringing computing education to people who are usually left without.
Keep an eye on my twitter account and this blog for details. The core of the talk is about Brython.
Video
Apparently, videos of talks will be available after the conference.
Always use a good bit of data to test your data driven apps. Don't rely only on nose testing. But where to get data? Fake it. Never underestimate the power of import random. But when you need more than numbers:
What it does: Although you could use real data, sometimes you don't have any. In fact, more than likely you won't be able to generate a significant amount of data for weeks after going live with your web application. Or perhaps it is a desktop application and you'll never see the generated data. So just fake it. You need volume, and it's easy to create.
Another point to keep in mind is that using real data might be risky, depending on what it is. For sure you do not want real credit card numbers floating around on development instances.
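As an illustration of the "fake it" idea using nothing but import random, here is a minimal sketch. Everything in it is hypothetical: the field names, the FIRST_NAMES and PRODUCTS lists and the fake_orders function are made up for the example, and a dedicated fake data library would give far more realistic values.

```python
import random

# Hypothetical values, made up for this illustration.
FIRST_NAMES = ["Ada", "Grace", "Linus", "Guido", "Margaret"]
PRODUCTS = ["widget", "gadget", "doohickey"]


def fake_orders(count, seed=42):
    """Return `count` fake order records as a list of dicts."""
    rng = random.Random(seed)  # seeded, so every test run sees the same data
    orders = []
    for order_id in range(1, count + 1):
        orders.append({
            "id": order_id,
            "customer": rng.choice(FIRST_NAMES),
            "product": rng.choice(PRODUCTS),
            "quantity": rng.randint(1, 10),
            "price": round(rng.uniform(0.99, 99.99), 2),
        })
    return orders


orders = fake_orders(10000)
print(len(orders), orders[0])
```

Seeding the generator is a deliberate choice here: a failing test can be reproduced with exactly the same fake records.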
Today's tip is quite basic, but will require time and effort to master:
Master the shell environment
What it does: Mac, Windows, Linux, BSD or Unix (or even something else). Whatever your operating system, become really good at using the command line, the shell. Bash, PowerShell, ksh93 etc. Learn it. Otherwise, it's like learning a bunch of words in a new language, but never learning the correct constructs. You might be able to communicate, but it'll never be very efficient. So go and find tutorials.
And then find the tools that'll make your life easier. For example, *nix users, are you familiar with autojump (plus it's written in python)? Windows users, did you know there is an equivalent Jump-Location for powershell?
Doing a quick lightning talk tonight for PYPTUG @WFU and it won't be about spanners and food as the title might imply, but about...
Functional programming
Yes, about functional programming in an imperative by design language (Python). And it's a lightning talk, so it'll be very superficial. But hopefully interesting nonetheless.
My first experience with pure functional programming was in the 80s with the Miranda programming language. If one counts impure functional such as LISP, scheme etc, then that would be Logo as my first.
Back to Python
One of the most appealing Pythonic techniques I use on a regular basis is functional. But more on that later. Here in this talk, I'll focus on a new module called Toolz (the result of a merge of functoolz and itertoolz - with a z, not the standard library functools and itertools).
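To give a flavour of the functional style in question, here is a small sketch that uses only the standard library. The pipe and compose helpers below are hand-rolled stand-ins written for this example (Toolz ships its own functions with these names), and double and increment are hypothetical functions used just for the demonstration.

```python
from functools import reduce


def pipe(value, *funcs):
    """Pass value through each function in turn, left to right."""
    return reduce(lambda acc, func: func(acc), funcs, value)


def compose(*funcs):
    """Return a function that applies funcs right to left, as in f(g(x))."""
    return lambda value: reduce(lambda acc, func: func(acc), reversed(funcs), value)


def double(x):
    return 2 * x


def increment(x):
    return x + 1


# Reads in the order the data flows: double, then increment, then str.
result = pipe(3, double, increment, str)

# Builds a new function without calling anything yet.
double_then_increment = compose(increment, double)

print(result, double_then_increment(3))
```

The point is that small functions are treated as values: they are passed around, combined and only applied at the end.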
Am I throwing together random words for titles now, in a weird captcha induced moment? No, just condensing my interest in lasers in a few words.
You might have seen the laser digitizer in Tron: Legacy
However, in my case, what triggered my interest in lasers was the original Tron laser digitizer
A few years later, I had the chance to play with a good old HeNe red laser, pumping a mighty 5mW (well, in the 80s, it was impressive) in the college lab. One of the things I did with it was to draw Lissajous figures (or curves) on a wall (a large wall outside, at night - even cooler), using two little speakers and mirrors I had brought (the lab was set up to only do prism and mirror experiments).
Googling, I see a nearby school (Appalachian State) has one such kit in their physics dept:
http://www1.appstate.edu/dept/physics/demos2/oscillations/3a80_40.html
Anyway, fun stuff, making math and physics a lot more interesting...
Electronica
There was the artistic connection that also further fueled my interest in lasers. There is a lot to talk about here, since I've composed and performed electronic music for many years (still write some) and hosted a radio show for about 10 years etc, so that'll be for another time.
I will bring up one point right now though: you can't talk about lasers in music shows without mentioning Jean Michel Jarre.
Jean Michel Jarre, Houston TX 1986
From his incredible live outdoor shows with lasers, lights and fireworks (one, a tribute to oceanographer Cousteau, had an attendance of over 2 million people in France in 1990) to his laser harp, Jarre without lasers wouldn't be the same.
On the road
The Raspberry Pi has a lot of appeal by itself, but I figured that it would probably be a good idea to add a laser in the mix. Since I had a presentation at PyCarolinas, I figured I'd write a script with Python (laserpulse.py, hence the title of this article) and build a little rig to project interesting patterns on the wall behind me.
So, using a laser in presentations, does it work? Well, at PyCarolinas, I got a lot of feedback on this, both during the presentation, after the presentation and even during other talks (heard during another talk "so we've learned today that lasers are cool")
In the audience: "I just want to say that this is the coolest command, ever."
And so on and so forth. The conclusion is this: Science needs some showmanship. But please, be careful when playing with lasers!
Video
So I'll leave you with a video of my little rig above controlled by the Raspberry Pi, going to the music of a very British band, doing a cover of the theme of a very British TV show. Very apropos, since the Raspberry Pi is a very British computer, after all.
Youtube video (Music by Orbital, Doctor? live at BBC)
One feature that has contributed to the Raspberry Pi's success is the possibility of interfacing the virtual (software) with the physical world. We are speaking of course of the "General Purpose Input and Output" pins, more commonly known as GPIO. They are available on the Pi at the P1 connector.
Making a GPIO cable
We are doing some green computing today! We are going to recycle some old computer cable and make a Raspberry Pi interface cable with it. We will then be able to connect motors, LEDs, buttons and other physical components.
The interface cable is a flat cable with 26 conductors. It is quite similar to an IDE hard disk cable (or ATA) with 40 conductors (avoid the ATA66/133 with 80 conductors):
Original IDE/ATA cable with 40 wires
Let's get to work
We will only need 2 connectors on our interface cable and not 3. With a cable with 3 connectors, we just need to cut one section off.
Cutting it with a pair of scissors
Before doing the cutting, we will do some marking. The reason is that we only need 26 wires, and we have 40. With the red wire on the left, we have to count 26 wires and mark the split with a fine permanent marker. We count on the right side to make sure we do have 14 conductors, not a single one more or less.
Using a permanent marker to write
We are going to divide the cable in two parts, using an x-acto style knife or scalpel: one with 26 wires and one with 14 wires.
splitting in two parts
We then have to cut one section of the connectors off, with a small saw, such as a metal saw (or a Dremel style cutting wheel).
we need to cut on the 7th column from the right
We remove the top part, then the cable section with 14 wires, and finally, after notching it, we remove the bottom part.
almost done, just remove the part on the right
We are done with the cutting. We can now connect the cable to the Raspberry Pi. The red wire is closest to the SD card side, and farthest from the RCA video out (yellow connector):
Connections
With the cable ready, we are now going to test it. Let's add 2 LEDs to do this test. We will use a red and a green LED, but you can use amber or yellow LEDs too. Blue, violet or white LEDs will not work, since they need more voltage.
The connection is really simple:
Red LED and green LED, short leg -> third hole on the left.
Red LED, long leg -> second hole on the right
Green LED, long leg -> third hole on the right
Python Code
First things first, you have to get the RPi.GPIO Python module. It is a module that will allow us to control the GPIO of the Raspberry Pi. On Raspbian, it is now included, but for another version of Linux, it can be installed with
$ sudo easy_install RPi.GPIO
Or through apt-get (or equivalent package manager):
$ sudo apt-get install python-rpi.gpio
Here is the code:
#!/usr/bin/env python
""" Setting up two pins for output """
import RPi.GPIO as gpio
import time
PINR = 0 # this should be 2 on a V2 RPi
PING = 1 # this should be 3 on a V2 RPi
gpio.setmode(gpio.BCM) # broadcom mode
gpio.setup(PINR, gpio.OUT) # set up red LED pin to OUTput
gpio.setup(PING, gpio.OUT) # set up green LED pin to OUTput
# Make the two LEDs flash on and off forever
try:
    while True:
        gpio.output(PINR, gpio.HIGH)
        gpio.output(PING, gpio.LOW)
        time.sleep(1)
        gpio.output(PINR, gpio.LOW)
        gpio.output(PING, gpio.HIGH)
        time.sleep(1)
except KeyboardInterrupt:
    gpio.cleanup()
Just save the code into a file named flashled.py.
PINR is the GPIO for the red LED (0 for RPi V1 and 2 for V2)
PING is the GPIO for the green LED (1 for RPi V1 and 3 for V2)
We select the Broadcom mode (BCM), and we set up the 2 GPIO as outputs (OUT). The loop alternates between red LED on / green LED off for one second, and red LED off / green LED on for one second ( time.sleep(1) ). Pressing CTRL-C during execution terminates the program after it cleans up after itself, through the gpio.cleanup() call.
Running the code
Usually, an LED should be protected with a resistor to limit the current going through it, but in this case it will blink intermittently, just to test the cable, so we don't need any.
For continuous use, it is recommended to put a resistor in series (about 220 Ohm to 360 Ohm).
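To see what that resistor range means in practice, Ohm's law gives the current as (supply voltage - LED forward voltage) / resistance. The sketch below works it out for both ends of the range. The numbers are assumptions, not measurements: 3.3 V for a GPIO pin set high and about 2.0 V of forward voltage for a typical red LED (check the datasheet of yours). The function name led_current_ma is made up for the example.

```python
def led_current_ma(supply_v, forward_v, resistor_ohm):
    """Current through an LED and its series resistor, in milliamps (Ohm's law)."""
    return (supply_v - forward_v) / resistor_ohm * 1000.0


SUPPLY_V = 3.3       # assumed voltage of a GPIO pin set HIGH
RED_FORWARD_V = 2.0  # assumed forward voltage of a typical red LED

for ohms in (220, 360):
    current = led_current_ma(SUPPLY_V, RED_FORWARD_V, ohms)
    print(ohms, "Ohm ->", round(current, 1), "mA")
```

Under these assumptions, the current works out to roughly 6 mA at 220 Ohm and under 4 mA at 360 Ohm.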
Before we can run the program, we have to make it executable:
$ chmod +x flashled.py
$ sudo ./flashled.py
CTRL-C will stop the program.
Red LED
Green LED
This concludes our article. I hope it was satisfying making this cable and recycling at the same time.
The follow up to this will be about making our own breadboard adapter, and having fun with a transistor.
I'll be giving the talk "Raspberry Pi: From Kindergartners To Mad Scientists" at ForsythTech in Winston Salem, NC this coming Monday at 5:30pm (Feb 4th), and Thursday at 3pm (Feb 14th). It will be in the Hauser building, room 332.
Old school flyer
Just got a copy of the flyer:
It's that $25 computer talk
Each time I give this talk somewhere, it ends up quite different from the previous one. In part due to questions, and in part because I adjust the content to the target audience.
Should be fun, there is interest from several programs, so I'll cover a wide range of material.
Learn Python. Starting at the beginning... Note: This is a translation of the popular basic guide to Python on the Raspberry Pi: Python sur RaspberryPi 01 (in French), adapted where it made sense.
For the experts, there's not much to see in this article. But if you keep seeing references to Python in your research on the Raspberry Pi, and wonder why everyone keeps talking about a snake, then this article will be perfect for you. Or perhaps you already know that Python is a programming language.
Welcome to our series of tutorials on the Python programming language, with a specific application to the Raspberry Pi.
Sites
Before we get too far, I want to provide you with a few basic links. The first is to python.org.
If you spoke another language, such as Spanish, Portuguese, French, Italian or Russian, I would point you to several other websites, because for other languages, individuals tend to have more complete sites than the official one (or, I should say, more in tune with the culture). But since you are reading this in English, python.org will become your primary stop.
There, you will discover regional user groups, documentation, downloads, mailing lists, a wiki and eventually, a Jobs section. You can't get enough documentation? Then head to readthedocs.org
If you are the visual type, also check out pyvideo.org
Books
In English, there are lots of choices. In fact, too many. For example, on Amazon, you'll find over 1000 books on Python. Even checking out only hardcovers, you'll still end up with over 85 books!
If you live in a big city, it is a good idea to go to your local book store, and check out what books they have. As I've taught some Python to others, and recommended some books based on their personality, I've noticed that almost everybody is different. One book that I like, you might hate, and vice versa, because we are all different.
So you'll have to dig and see what book appeals to you, based on styles that vary from "Python for Kids", "Hello Python" and "Head First Python" to "Core Python Applications Programming" or even a "Python Essential Reference". There are many textbooks available too; some are assigned reading material for Python classes in colleges and universities, and are worth your time to check out.
Free eBooks
I won't hold back in recommending some books in this section. Considering how much you will pay for them... the value to cost ratio is difficult to beat :)
Green Tea Press has several free ebooks. These are also available in print and as eBooks from O'Reilly, for a fee. Among them is the classic text (or I should say the new edition of the classic text):
Have you found a free ebook that should be mentioned here? If so, please leave a comment!
Python, the program
So, Python is a programming language, but it is also an executable program. It is included with the Raspberry Pi and with most Unix type systems (Solaris, OpenIndiana, Mac OS X and the varieties of Linux and BSDs). It can also be installed on Windows, iPhone, iPad, Android etc. In this article, I will differentiate the python program from the language, by writing it in bold characters.
It is possible to use python in various ways, or modes of operation.
First mode
In an interactive way, from the command line:
pi@raspberrypi ~ $ python
Python 2.7.3rc2 (default, May 6 2012, 20:02:25)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
The python shell is now waiting for a command, to have something to do. We will ask it to print the words "Raspberry Pi!". In the same way that I had to put quotes around the words to clearly differentiate what I want to print from the rest of the text, in Python it is required to delimit a string of characters (a sentence) with single (') or double (") quotes on each side of this string:
pi@raspberrypi ~ $ python
Python 2.7.3rc2 (default, May 6 2012, 20:02:25)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print "Raspberry Pi!"
python answered us:
Raspberry Pi!
If our string of characters extends beyond one line, then we will have to use the single or double quote symbol, three times on each side ("""I'm writing a lot of words, so you better be ready for me with a multiline string, since I will go on and on and on.""")
pi@raspberrypi ~ $ python
Python 2.7.3rc2 (default, May 6 2012, 20:02:25)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print """
... Raspberry Pi!
... I continue.
... I am done!
... """
python answered us:
Raspberry Pi!
I continue.
I am done!
However, in our case, we are not done at all, we are barely starting! It is as good a time as any to mention at this point that a string that starts and ends with the triple quote ("""string of characters""") by itself, without instructions, is what is called a docstring, a string of characters (or sentence) for documentation purpose. A form of commentary, in other words.
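To make the docstring idea concrete, here is a minimal sketch. The greet function is hypothetical, made up for the example. When such a string is the very first statement of a function, Python keeps it and makes it available as the function's documentation.

```python
def greet(name):
    """Return a greeting for name."""
    return "Hello, " + name + "!"


# The docstring is stored on the function itself; help(greet) displays it too.
print(greet.__doc__)
print(greet("Raspberry Pi"))
```

Unlike a comment starting with #, which python ignores entirely, the docstring remains attached to the function while the program runs.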
We can also use python like a calculator:
pi@raspberrypi ~ $ python
Python 2.7.3rc2 (default, May 6 2012, 20:02:25)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 2 * 32
64
To exit the python program, we must type the command: exit()
The word exit, followed by a pair of parentheses, indicates that exit() is a function. We will come back to functions later on (and we will learn that print is a strange animal, that should be written print(), but that's for another time).
Second mode
If our code is quite short, there are no issues with using the interactive mode. But it might become burdensome to write the same code again and again. It's a good thing then that we can save our code in a file, whose name will end with the .py extension and that we can execute again and again.
Let's save our program that prints "Raspberry Pi!" in a file. In order to do that, we will need an editor. For the experts, I would recommend something like scribes in graphical mode and vim in console mode (text).
Since we are starting out, I would recommend instead to install geany:
pi@raspberrypi ~ $ sudo apt-get install geany
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
geany-common
Suggested packages:
doc-base
The following NEW packages will be installed:
geany geany-common
0 upgraded, 2 newly installed, 0 to remove and 134 not upgraded.
Need to get 3,401 kB of archives.
After this operation, 8,682 kB of additional disk space will be used.
Do you want to continue [Y/n]? y
Get:1 http://mirrordirector.raspbian.org/raspbian/ wheezy/main
geany-common all 1.22+dfsg-2 [2,336 kB]
Get:2 http://mirrordirector.raspbian.org/raspbian/ wheezy/main
geany armhf 1.22+dfsg-2 [1,065 kB]
Fetched 3,401 kB in 6s (518 kB/s)
Selecting previously unselected package geany-common.
(Reading database ... 91199 files and directories currently installed.)
Unpacking geany-common (from .../geany-common_1.22+dfsg-2_all.deb) ...
Selecting previously unselected package geany.
Unpacking geany (from .../geany_1.22+dfsg-2_armhf.deb) ...
Processing triggers for hicolor-icon-theme ...
Processing triggers for man-db ...
Processing triggers for menu ...
Processing triggers for desktop-file-utils ...
Setting up geany-common (1.22+dfsg-2) ...
Setting up geany (1.22+dfsg-2) ...
Processing triggers for menu ...
pi@raspberrypi ~ $ geany raspberry.py
We just launched our editor, with a filename of raspberry.py. We type our code in the window raspberry.py:
Geany code editor
We save, and we quit (for right now, so as to keep things simple).
How can we run our raspberry.py script? Quite easily:
This is the second mode of operation, python running a script.
Third mode
The third mode is available on Unix type computers, and as such, on the Raspberry Pi. Let's bring up our geany code editor once more:
pi@raspberrypi ~ $ geany raspberry.py
We insert a new line 1
We save after adding the new line, and we quit. The line we've just added to the script tells the operating system shell which program will run this code. In our case, we specify python.
That is not all, however. We also have to change the file from a document mode, to an executable mode. We do this through the chmod command:
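For the curious, the change from "document" to "executable" is just a few permission bits on the file. The sketch below does something similar from Python itself. It is only an illustration of the idea and not a replacement for the chmod command: the make_executable name is made up, and it always adds the execute bit for the user, the group and others.

```python
import os
import stat


def make_executable(path):
    """Add execute permission for user, group and others, keeping the other bits."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

Calling make_executable("raspberry.py") in the directory that holds the script would mark it as executable.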
Wouldn't it be nice if we didn't have to leave the comfort of our editor each time we wanted to run the program after making a change? We can do this through the use of the F5 key, or through the Build->Execute menu, or even using the button with the gears (View or Execute):
Raspberry Pi! - press return to exit that screen
This concludes our first basic tutorial on Python on the Raspberry Pi. I hope that this was sufficient to get you started.