Friday, September 18, 2009

On Playing With Data

Recently at work I found myself with some data, not much but enough to make a few graphs, and I needed to use it to understand what was going on.

More specifically, I had some dates and the mass of a thing from which we were trying to etch graphite. I wanted to find a way to express the rate of this etch that was both visual and easy to compare.

In the end, I didn't exactly reach these goals, but I did learn plenty about Playing With Data, and playing is really what it is!

In all my high school labs, you'd do some experiment, take some data, and then work through lab questions or analysis that gently guided you into seeing empirically the equations you had learned the day before. This is nice and all if you tend not to believe the teacher or textbook and need concrete proof, or if you had a hard time internalizing and understanding the applicability of what you learned, or if you were just really interested (an interest that often went away after doing these labs).

This was quite different. I hadn't been taught the day before what was supposed to be going on (no one really knows), I hadn't learned the equations that the data should fit, and I didn't have some nice numbered questions that brought me gently to the main conclusion or point of the lab.

So I played. I fiddled around with and manipulated the data. I made tons of different graphs, and tons of different best-fit lines or functions.

The most important thing I learned is that data without a visual representation is largely meaningless and hard to understand. When you have just a list of x and y coordinates, it is very rare that you can glance at them and go, "oh yeah, that clearly indicates this!" With a clear, well-made graph of the same data, though, you often can.

Creating these graphs and manipulating this data also taught me another good lesson: more data is better. I was working with a very small amount of data, almost the minimum I could draw any reasonable conclusion from. I did have enough to draw such conclusions for my own purposes, but having so little made the work much harder and the conclusions much less reliable. The more data you have, the easier it is to see trends and patterns, and the more exact your mathematical understanding of them will be.

Speaking of mathematics, I learned that it's very helpful to approach data analytically and mathematically so as to gain an understanding of what you should do with it. In this case we thought that the masses should change exponentially, that is, Mass = A*e^(kt) (with k negative if the mass is decreasing). This would make sense, since we assume that the longer the etch goes, the less graphite there is to etch, and so the rate slows. Taking the natural log of both sides gives ln(Mass) = ln(A) + kt, so if you graph the natural log of the masses against time, the relation should be linear, with slope k. This seemed to be true of our data. With further data and investigation, we may be able to figure out a way to compare or standardize the etching rate; however, I suspect that to do this we would need to look into the role of surface area in the rate.
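For anyone who wants to try the same log-and-fit trick, here is a minimal sketch in Python. The numbers are made up for illustration (they are not the etch measurements), and the function name fit_exponential is hypothetical. It fits a straight line to ln(mass) against time by ordinary least squares, then reads k off the slope and A off the intercept.

```python
import math

def fit_exponential(times, masses):
    """Estimate A and k in mass = A * exp(k * t).

    Works by fitting a least-squares line to ln(mass) vs. t:
    the slope is k and the intercept is ln(A).
    """
    if len(times) != len(masses) or len(times) < 2:
        raise ValueError("need at least two (time, mass) pairs")
    logs = [math.log(m) for m in masses]  # masses must be positive
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    k = sxy / sxx                        # slope of the line
    A = math.exp(y_mean - k * t_mean)    # exp(intercept)
    return A, k

# Synthetic example data (assumed values): A = 10, k = -0.3
days = [0, 1, 2, 3, 4]
masses = [10.0 * math.exp(-0.3 * d) for d in days]

A, k = fit_exponential(days, masses)
print(f"A = {A:.3f}, k = {k:.3f}")
```

With real measurements the points won't sit exactly on a line, so it is still worth plotting ln(mass) against time and looking at how well the line actually fits before trusting k.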

And finally, documentation is key! In our digital world, no information need be deleted (thanks, Google) and everything is easily recorded and saved. So record and save it! I kept most of the graphs I made, even if they didn't make any sense, and I kept many of the pictures, even though they don't give any information towards figuring out the rate. All of it could be useful later on.
