(based on actual conversations I have had with domain scientists...)
Solving a scientific problem? I can help!
Is it a partial differential equation? I know something about that. I've solved Schrödinger's equation in many forms. Advection and diffusion I know well. Choose your finiteness: finite elements, finite differences, or finite volumes? Each one has its own advantages and pitfalls.
Linear algebra? Use an optimized BLAS library, not the reference library! Our vendor has a group of applied mathematicians who developed these math libraries just for you. Don't let all their effort go to waste! Eigenvalues -- you don't need to orthogonalize at every step, just every so often. Maybe it will require a few more iterations, but the iterations will be so much cheaper...
Optimization? Yes! Finally something I am an expert in! Okay, so what type of problem are you solving? Is it convex? Is the objective function cheap to evaluate? How many parameters? Is it a mixed-integer program? Actually, I think your problem could be reformulated as a combinatorial optimization problem and we could try an evolutionary algorithm...
Oh, you meant code optimization. Yes, I can do that too. See these loops? Since it's Fortran, we need to reorder the loops so that the array is accessed by its first index in the innermost loop, not its second. See, we just got a 4x performance boost just by swapping these indices. Okay, let's change this array to a function, because it is easily calculated and the function can be inlined whereas that memory lookup can't. And let's eliminate these "if" statements by dividing this loop into two parts.
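To make the loop-reordering point concrete, here is a minimal sketch, in Python rather than Fortran, of why the first index belongs in the innermost loop. It assumes the standard column-major layout, where element (i, j) of an array with n_rows rows sits at offset i + j * n_rows (0-based here for simplicity). The function names are hypothetical, made up for this illustration, and the sketch only shows the memory access pattern; it does not reproduce the 4x speedup.

```python
# Hypothetical illustration: Fortran stores arrays column-major, so element
# (i, j) of an n_rows x n_cols array lives at linear offset i + j * n_rows
# (0-based indices here; Fortran itself defaults to 1-based).


def column_major_offset(i, j, n_rows):
    """Linear memory offset of element (i, j) in a column-major array."""
    return i + j * n_rows


def visit_order(n_rows, n_cols, first_index_innermost):
    """Return memory offsets in the order a doubly nested loop touches them."""
    offsets = []
    if first_index_innermost:
        # j outer, i inner: walks memory contiguously.
        for j in range(n_cols):
            for i in range(n_rows):
                offsets.append(column_major_offset(i, j, n_rows))
    else:
        # i outer, j inner: jumps n_rows elements on every access.
        for i in range(n_rows):
            for j in range(n_cols):
                offsets.append(column_major_offset(i, j, n_rows))
    return offsets


def strides(offsets):
    """Distance, in elements, between consecutive memory accesses."""
    return [b - a for a, b in zip(offsets, offsets[1:])]


good = visit_order(3, 4, first_index_innermost=True)
bad = visit_order(3, 4, first_index_innermost=False)
print(strides(good))  # every step is +1: stride-1 access
print(strides(bad))   # steps of +3 with jumps back: strided access
```

With the first index innermost, every access is one element past the previous one, which is what caches and prefetchers like. With the loops the other way around, each access skips a whole column length.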
Your job fails at 120,000 MPI processes, but not below that? What happens when it fails? How long does it take to get to the point of failure? What is the code doing when it fails? Does it always happen in the same place? Let me ask my sysadmin colleague to look in the top-secret log files that only they can view...
Does this computation depend on that one, or can it be done in any order? Is there any reason to keep this data after initialization? What if we used this framework? Asynchronously spawning new tasks would implicitly load balance this algorithm. Your algorithm is not scalable. The synchronizing you do will be a huge bottleneck as you scale up. It may work okay now, but at the petascale or beyond there will be scalability issues coming out of the woodwork, things even beyond this synchronization issue. Trust me, I have seen this happen even in my own codes. Let my pain be your gain!
I may not be a scientist in your field, but I know a lot of things that can help you. I've boosted the performance of codes by a factor of two with a single keystroke. I may not have seen your problem before, but I've seen something like it.
Showing posts with label science. Show all posts
Sunday, September 25, 2011
Tuesday, September 21, 2010
Adventures in Little Things Making a Big Difference
So, at work I work with some scientists in a field that rhymes with shmooclear shmisics. These people are very smart application scientists, but definitely not computer scientists. I love them because as long as they exist, I will always have a job. They write pretty miserable code, because that's really not their thing. They just want to do the science.
Half of my job with them is improving their code, but more importantly (if I want my efforts to not be wasted), I spend a lot of time developing a working relationship with them. You see, they are a kind of insular community, and don't generally trust some "hot shot" outsider who doesn't know the science. But I have been able to do a few simple things that have drastically improved the performance of their codes, so I think we are getting somewhere.
Most recently, I was profiling one of their codes, and discovered that it spent more than 50% of its time sorting. I looked at their sorting algorithm and saw that it was some homebrew sorting algorithm that was kind of like bubble sort (with computational complexity of order N^2, or the worst possible performance without doing something completely stupid). My guess is, they didn't know that some smart computer scientists had thought a lot about sorting algorithms and developed smart ways to sort; they probably just thought of how they would sort things and implemented that. So I replaced their Frankenstein sort with a heapsort algorithm (worst-case complexity N log N) and the sorting became an insignificant portion of the total runtime. Then, I showed my primary collaborator what I had done, and they discussed it in a meeting the next week. As it turned out, nobody knew why it was sorting; it was some legacy of an abandoned algorithm. I removed the sorting altogether and am in the process of doing a little benchmarking study.
It was pretty amazing, though, that this piece of code that nobody realized was being executed was taking up more than 50% of the test problem runtime, and more than 20% of the benchmark problem runtime!
The next bottleneck in their code is the I/O. They are reading the input in a very unintelligent way (all the processors opening the same file and reading it), so I plan to fix that for them when I get the chance.
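For anyone curious how big the gap between an N^2 sort and an N log N sort really is, here is a rough sketch that counts comparisons for both on the same data. This is not the scientists' code or my actual replacement: the bubble sort stands in for their homebrew algorithm, the heapsort is built on Python's standard heapq module, and the Counted wrapper and helper names are made up for the illustration.

```python
import heapq
import random


class Counted:
    """Wraps a value and counts every '<' comparison made on it."""

    comparisons = 0

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.value < other.value


def bubble_sort(items):
    """Quadratic sort: repeatedly swap adjacent out-of-order pairs."""
    items = list(items)
    for end in range(len(items) - 1, 0, -1):
        for k in range(end):
            if items[k + 1] < items[k]:
                items[k], items[k + 1] = items[k + 1], items[k]
    return items


def heap_sort(items):
    """N log N sort: build a heap, then pop the minimum N times."""
    heap = list(items)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


def count_comparisons(sort_fn, values):
    """Sort the values with sort_fn and report how many comparisons it took."""
    Counted.comparisons = 0
    result = [c.value for c in sort_fn([Counted(v) for v in values])]
    return result, Counted.comparisons


random.seed(0)
data = [random.random() for _ in range(1000)]
bubble_result, bubble_count = count_comparisons(bubble_sort, data)
heap_result, heap_count = count_comparisons(heap_sort, data)
print("bubble sort comparisons:", bubble_count)
print("heapsort comparisons:   ", heap_count)
```

For 1,000 items this bubble sort always makes N(N-1)/2 = 499,500 comparisons, while the heapsort needs only a small fraction of that, and the gap widens as N grows.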
Monday, February 22, 2010
Violins and Fungi
Did you ever wonder what beautiful violin music and fungi had in common? No? Why not?
As it turns out, scientists treated wood with fungi to see if they could create wood similar to the wood that Stradivarius used to create the violins that today go for millions of dollars. Please read this article about it, written by a high school student, for more details!
Tuesday, June 09, 2009
Attention: Breadwinning Moms
Are you a sucker for being part of Science? I know that I am, which is why when I found out about the Bread and Roses project, I had to join. Sociologist Andrea Doucet is doing research on families in which women are the primary breadwinners. There's also a forum on the site where you can post your story or engage in discussion about life as a breadwinning mom. Head on over to the site and make your contribution to science!
Saturday, May 31, 2008
Career Day
Thanks to everybody for all your advice on career day. I gave a 30-minute presentation full of slides with interesting pictures. I began by asking who liked math. I got a show of a few hands. Then I said, "I'm glad that some people here like math. But for the rest of you, I have bad news. Any career is going to involve math to a greater or lesser degree. And generally speaking, the more math that's involved, the more money you'll make."
I had some pictures from popular television shows to get them interested. I asked them if anybody watched "Grey's Anatomy." There was a show of a few hands. "Well," I said, "what if you're a doctor and you prescribe 200 mg of a very potent medicine instead of 200 µg?"
I also had a picture from the show "CSI." Again, I asked if anyone liked that show. And then I asked what would happen if you were working on the very last DNA sample and you added 5 mL of solvent when you meant to add 50 µL? In both of these cases, knowledge of math is vital for job success.
Then I talked about my job. I told them about our supercomputers, giving the really cool numbers about how many flops* the machines do; our huge, expensive cooling system (capable of cooling 640 large houses); how much our power bill is ($5-7 million/year), etc.
And I told them about the science. I showed some pretty pictures of various applications, starting off with combustion. I asked who got to school today thanks to the power of internal combustion. There was some confusion, but after it was established that I was talking about engines, just about everyone raised their hands. Then I asked who had heard their parents complaining about the high price of gas lately. Everyone raised their hands for that one. Well, I said, that's because some of the best cars, such as mine, get maybe 30-40 mpg. But wouldn't it be cool if we could get more like 300 or 400 mpg? That's why we study combustion.
I ended the presentation by talking about my educational background and then what they could do if they were interested in a career like mine. I told them what sort of educational activities they should do, but above all encouraged them to be persistent and not to let other people discourage them. I also showed a slide with pictures of some of the youngest and most attractive people I work with. In addition to being more visually appealing, they are more diverse, and it is part of my mission as a member of an underrepresented group in computer science to encourage students from underrepresented groups to join us. (I showed pictures of three people, two of whom were women, and two of whom were African-American.)
After my presentation was over, I got quite a few good questions. One joker asked something about my advanced age, but otherwise the students were genuinely curious. I felt that career day was a success and I'm grateful for all your advice.
* A flop (in addition to being a bad joke that nobody laughs at)** is a FLoating-point OPeration -- basically, any arithmetic operation involving numbers with decimal points, such as 1.1+1.1. Our big machine does 263 teraflops, or 263 trillion floating point operations per second. If everyone in the world were capable of doing one floating point operation per second, and we all worked together, it would take us nearly half a day to do what it takes this machine one second to do.
** Standard leadership computing facility tour guide joke.
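The "nearly half a day" figure in the flop footnote can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch: the machine speed and the one-operation-per-person rate come from the post, while the world population of roughly 6.7 billion (about right for 2008) is my assumption.

```python
MACHINE_FLOPS = 263e12     # 263 trillion operations per second, from the post
WORLD_POPULATION = 6.7e9   # assumption: rough world population in 2008
HUMAN_FLOPS = 1.0          # one operation per person per second, from the post

# Time for all of humanity to match one second of machine work.
seconds_needed = MACHINE_FLOPS / (WORLD_POPULATION * HUMAN_FLOPS)
hours_needed = seconds_needed / 3600

print(f"{hours_needed:.1f} hours")  # prints "10.9 hours"
```

About eleven hours, which is indeed a little short of half a day.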
Tuesday, May 06, 2008
In Which I Solicit Advice from My Vast Readership
O Vast Readership, maybe some of you will be able to help me figure out what I should do with the latest opportunity I have been afforded!
Later this month, I will be speaking at my local middle school's career day. My topic is "Math Careers," and I have a half hour allotted to me during each math period. My target audience is 8th graders (13-year-olds), I believe, but the school goes from 5th-8th grades.
I'd like to engage them and keep them interested in what I'm talking about. I could drone on and on about things that they will probably find boring, but what I was thinking was that quite honestly, every job they will ever have will involve math to a greater or lesser degree. I thought I might inform them of that fact, give some examples (e.g., cooking, which requires knowledge of fractions, multiplication, division, addition, and subtraction) before proceeding on to more math-based careers (such as scientist, engineer, computer scientist, and mathematician). But I'm not really sure what specifically to say and I'm not sure how I'm going to fill a half-hour, or maybe let's say 25 minutes with five minutes for questions. ;)
This is where the expertise of my vast readership comes in. I was an atypical 13-year-old in that I would have actually been fascinated by anything an actual mathematician had to say, so I am lacking a certain perspective. Any suggestions?
Tuesday, February 19, 2008
Growing the "Computity"
Something fun that I get to do in my job is give tours. My boss doesn't want me to do more than one a week, and because of his prohibition, I give tours infrequently enough that they are fun every time. I give a 15-minute spiel on the "observation deck" overlooking our machine room, before escorting the visitors upstairs to the visualization lab.
There are some standard tour guide tricks that I perform. If you tour any cave, there are standard cave tour guide jokes (such as the wishing rock... the rock you wish you hadn't hit your head on), and likewise there are standard leadership computing facility tour guide jokes.
I make the jokes to keep people awake and interested. But I hope that I do more than provide light entertainment for our visitors, especially the students.
In particular, I hope that my words reach deeper than a light-hearted tickling of their funny-bones. I hope that some of the students who visit come away with new ideas about their futures. I hope that they discover that supercomputing is a fascinating field. I hope they can see all the things I love about my job, and seriously consider a career in high-performance computing. I hope that they can see that scientists are normal folks with people skills and good senses of humor.* I hope that I can be a role model, to girls in particular, who can remember me as a counterexample when people tell them (directly or indirectly) that science is not for them.
Even if they don't remember me later in life, I hope that I have planted a seed in their minds, and that someday, some of these children grow up to be computational scientists. The "computity" needs new members!
* Nerd joke: How do you know that you're talking to an extroverted {mathematician, computer scientist, physicist}? Because the {mathematician, computer scientist, physicist} is looking down at your shoes rather than his or her own while talking to you.
Another good (but only tangentially related) joke: How do you know that you're dealing with the mathematics mafia? Because they make you an offer you can't understand.
scientiae-carnival
Sunday, February 17, 2008
Things I Love about My Job
I love the work I do for a living. Okay, so the excessive number of PowerPoint presentations, Excel spreadsheets, and Word documents I have to create and/or edit in order to impress Important People -- I could do without. But the science -- that is exciting stuff!
In my job, I work with power users of our supercomputers to get their codes up and running on the big machines. Depending on what the project PI wants, I just help them get started, or I get deeply involved as a member of the development team, or anything in between. These are people who have allocations of millions of CPU-hours, and run big codes that simulate scientific processes that are generally either impossible or too expensive or dangerous to do in the lab.
For example, we have users who simulate supernovae. As I say when I give tours, you can't simulate a supernova in the lab, or if you did, no one would live to tell about it. You also can't go check one out "in the field," because it's (hundreds or thousands of) light-years away and by the time you got there a) you'd be dead, and b) the event would be over. Furthermore, even if you did get there in time, a supernova is... shall we say... inhospitable to human life. So the only thing that astrophysicists can do is take the observations they can make from earth and near-space, combined with their knowledge of the laws of physics, and simulate supernovae on a computer. And there is so much physics involved that these computations require the use of thousands of CPUs for days at a time.
I don't work with the astrophysicists; I work with chemists and nuclear physicists. The nuclear physicists are my new project, so I don't know that much about what they do. But I do know what the chemists are doing, because I've been working with them since I came here as a postdoc.
One of the things they're studying is catalysis. A catalyst is a substance that speeds up a chemical reaction but is not used up by the chemical reaction process. The production of ninety percent of commercially-produced chemicals involves catalysis at some stage or another. You may have heard of the catalytic converter in your car's exhaust system, which converts toxic chemical byproducts of combustion into less toxic chemicals.
From what I understand, the discovery of most catalysts has been more-or-less serendipitous. Somebody accidentally contaminates a reaction, and discovers that the desired chemical reaction still occurs and actually goes faster! But it's inefficient, expensive, and possibly even dangerous to make discoveries in this way. If we instead simulate catalysis on a computer, we can be more systematic about it, and try a bunch of different catalyst candidates for a given reaction, without having to worry about safety or pollution. Then, we can pick the top-performing candidates, and actually try them out in the lab.
Something I really love about this job is the fact that I get to work on projects that make important breakthroughs in many different fields of science. I don't know much more than I've just described about catalysis, yet my work is instrumental in the true experts on catalysis learning even more about it.
Sometimes, the day-to-day stuff, such as tracking down a bug, or figuring out why the code doesn't compile, or why it gives incorrect answers, can be a real drag. But seeing the bigger picture is what makes all that boring stuff worthwhile.
Saturday, November 17, 2007
Judgment Day Program
While I was on travel, Nova had a show about the Dover, Pennsylvania Intelligent Design trial. I didn't get to watch it when it aired on television, but I did watch it this evening on my computer. I really enjoyed the program, and gained perspective about the trial and the situation that precipitated it.
It was obvious that the purpose of the members of the school board was to introduce this religiously-based "theory" into the classroom, and I am glad that Judge Jones, a George W. Bush appointee, saw through their shenanigans and ruled against teaching "Intelligent Design" in the classroom.
"Intelligent design" is not science. For starters, it is not testable, it is not falsifiable, and it does not contribute anything in the way of explanations of natural processes. If it has none of the characteristics of science, it cannot be science, and it does not belong in a science class.
Of course it is not unexpected that I would not be on the side of so-called intelligent design, which is simply Biblical literalism dressed in fancy clothes and wearing lipstick. Anyone who believes that the earth is on the order of thousands rather than billions of years old lacks mathematical perspective. And anyone who thinks that we are intelligently designed has evidently never suffered from ulnar nerve entrapment. Because let me tell you, having a major nerve basically exposed, sitting right below the skin of your elbow? That's fucking moronic. Even my orthopedist -- who is not, to the best of my knowledge, omniscient -- figured out a better place to put it.
Labels:
all about me,
arm,
interesting,
my opinion,
science