My job involves working with a lot of data. So, let's say you're a teacher planning a trip with your students. You record a list of their names, their addresses, their ages, and maybe the phone numbers of their parents so that you can call them when they're in trouble. Now, what we have is called a data set. My job involves looking at that list, and then trying to see if there's anything interesting that I can obtain from it. For example, I can see how many kids we have that are aged 10, how many were born in the year 2010 or live in the same area, or something like that. So, for example, if I want to organize transport, I can say one bus goes to this area, and then maybe a car goes to that area, because I know there are fewer kids that live there. It’s all about gaining actionable insights from data.
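The class-trip example can be sketched in a few lines of Python. This is only an illustration: the student records, the `count_by` and `pick_vehicle` helpers, and the bus threshold of 3 are all made-up assumptions, not something from a real project.

```python
from collections import Counter

# Hypothetical data set: names, ages, and areas are invented for illustration.
students = [
    {"name": "Student A", "age": 10, "area": "North"},
    {"name": "Student B", "age": 11, "area": "North"},
    {"name": "Student C", "age": 10, "area": "East"},
    {"name": "Student D", "age": 10, "area": "North"},
]


def count_by(records, field):
    """Count how many records share each value of the given field."""
    return Counter(record[field] for record in records)


def pick_vehicle(headcount, bus_threshold=3):
    """Assumed rule: send a bus when enough kids live in an area, else a car."""
    return "bus" if headcount >= bus_threshold else "car"


age_counts = count_by(students, "age")    # how many kids of each age
area_counts = count_by(students, "area")  # how many kids live in each area

# Decide transport per area from the headcounts.
transport = {area: pick_vehicle(n) for area, n in area_counts.items()}

print("Kids aged 10:", age_counts[10])
print("Transport plan:", transport)
```

With these sample records, three kids are aged 10, so the North area (three kids) gets a bus and the East area (one kid) gets a car.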
I would say in decision making, especially in large organizations which collect tons of data about people. If you have information about your customers, you can use it to help you make decisions which benefit them. For example, packaging size. I remember when I first started working, I was living alone and I didn't have a fridge. So if I bought a litre of milk, it would go bad before I finished it. I ended up buying powdered milk instead. If the company made smaller serving sizes, I would have bought those instead. So, as a business, if you want to determine what your customers want, you could try selling different items or even different variations of the same thing. And then when you collect sales data, you can decide on a better production schedule or change how you serve different areas. A business can even turn internally and use data to better serve their employees.
Sometimes you can use data to make predictions about the future, often using machine learning models. These are models which learn from the data you provide them. So if you have a lot of data on that specific thing, you can create an automated process where the machine learns what was done in the past to predict the future. Of course, these models are not always accurate but they are very useful.
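As a rough sketch of what "learning from the past to predict the future" can look like, the snippet below fits a straight line to a few past prices and extends it one step ahead. A least-squares line is just one of the simplest possible models, chosen here for illustration; the weekly prices and the `fit_line` helper are assumptions, not data or code from any real project.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical past observations: week number and price in that week.
weeks = [1, 2, 3, 4]
prices = [10.0, 12.0, 14.0, 16.0]

# "Learning" step: estimate the trend from past data.
slope, intercept = fit_line(weeks, prices)

# "Prediction" step: extend the trend to a week we have not seen yet.
forecast_week_5 = slope * 5 + intercept
print("Forecast for week 5:", forecast_week_5)
```

Real data is rarely this tidy, which is one reason such forecasts are not always accurate.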
It's funny, actually. So when I was completing my undergraduate degree, my mom kept encouraging me to apply for a Master's. So, I applied for two different scholarships, right? In one case, I got into the school, but I didn't get the scholarship. In the second case, I applied for the African Institute for Mathematical Sciences. Now the funny thing with AIMS was that when I was doing the application, I didn't know that I was applying to study Mathematical Sciences. In the section where it said, select the fields that are most interesting to you, I selected Computer Science. And when I got the acceptance letter, I thought that’s what I'd be studying.
When it was time to apply for the visa, I noticed that the university acceptance letter said Mathematical Sciences. And I remember calling them and saying, “I don't think I can do that.” The lady I was talking to explained to me that I would be able to select the courses I was most interested in. So I thought, “Okay, I will just go. And I'll pick computer science related courses.” Unfortunately, even when it was a computer science course, you were learning about the math behind it. So it was still a lot of math for me, all because I didn’t properly research the opportunity beforehand.
We did a course in data science. And then we did a course in signal processing. In both, I got to work on a mini data project and that’s when I realized how much I enjoy this kind of work. That's when I was like, “You know what, I think I can do this.” I also realized that the programming required for data science work is less complex than that required for software development. And I decided to change tracks.
Days and weeks vary. Recently, I completed a research internship. My work was on price forecasting and analysis of how market prices vary for different agriculture products. Because it was research based, you just really focus on the input that you have and what it can tell you.
Typically, on Mondays, I’d have meetings with my supervisor and my co-supervisors. For this project, I had two co-supervisors and one project head. We would share progress reports on our work. I’d spend the rest of the day making a plan for myself. What are my next steps based on the outcome of the meeting discussion? If I had a backlog from the previous week, I would finish it up so that I could move on to what was next. Sometimes, I needed to connect with other people to achieve my objectives. Because I was working on market analytics, some of the prices were easy to understand. But some of the data was focused on West Africa, which isn’t a market I’m familiar with, so I couldn't make a decision, to say, for example, this price in Senegal looks like it's wrong.
To figure it out, I would check economic reports and business updates, or check with people from different projects or departments who are subject-matter experts.
Once I understood the context of it all, then my actual work was to sit down in front of a computer, look at data plots and generate reports. Python is my preferred programming language.
Also, one thing that I’m trying to do more of is read some data science blogs so that I can become better at my work.
Cleaning data takes a lot of time! Like more than 40% of the project is cleaning the data. For my previous project, some of the data was collected by other people in other countries. And so it can be hard to get in touch with the people who sourced it. Language barriers also exist so it's hard. Often, you need to consult other people to make decisions on how best to clean up the data.
(Tatenda previously worked at Sandvik as a Data Scientist) I wish I was allowed to take a picture, hey, it looks really cool! Six of us share the space because it's an open office plan. There are glass walls and huge screens, showing reports, and more like graphs of the data that we collect from the core business. It’s a mining company offering mining solutions. They have machines that they send to mines and the data the machines collect is sent to us. So we have a lot of reports, mostly from Power BI and other services. It has a cool tech vibe. The average age is 23, except for my manager who is around 50. It's a very comfortable working environment where I feel like people understand each other.
That's a hard one for me, because it always depends on the organization, because you find out different things when you get to the organization. In general, the most interesting part of what I do is I get to write code that is not complex. Coming from a computer science background and expecting to become a software engineer, I was prepared for that kind of role. I did an internship once at a company where I was doing web application development, user support and network maintenance. So I had to think about the database and the user interface while writing the code that combines everything else. It was a lot. Before, I always felt like you needed to know a lot of things for you to be good at what you were doing. But now, with what I do, I can write code to automate some of the work and worry less about the aspects I don’t enjoy. This allows me to feel more comfortable and confident in my work. I feel like everyone is happier doing something that they are comfortable with and I’m happy with what I’m doing.
It's not very different. Organizations function in different ways. The first place I worked at had processes and steps I could follow about what to do with the data and what reports to write. So it was easier than what I expected because I knew what I was supposed to do. At the next organization, it was research based work and I didn’t get a lot of instructions on how to go about it. I wish I had received more guidance, especially because I felt like I needed a better understanding of the field I was working in. So at first, I didn't really understand what I was doing. But after a while I got the hang of it.
Well, it depends. So, when you're considering getting into data science, look on LinkedIn. You’ll see that people come from different backgrounds. Some of them come from physics, some of them come from math, some of them from business, from manufacturing, all kinds of different industries. It's not about the background of what you studied. I think it's about solving a problem. It's about looking at the data that you're collecting and thinking “how can I use it to add value to the organization?” That's where you start. So before thinking about “what skill do I need to have?”, you will first need to identify what problem you need to solve and then you can identify the skills needed for the situation. Maybe you need to do some data visualisation. Okay, what visualisation tools exist? Learn how to use them. Do a course in Power BI or Tableau. Maybe you need to sort out the data infrastructure. What skills are needed there? Do that. There are levels to these things. Be the kind of person who can see past the core of the work you’re doing and recognise the data insights.
Are you a finance student? An engineering student? Understand the data available in that field and how to engage with it to build your skills. Use your time as a student to try mini projects. Try stuff out and think carefully about what programming languages are specific to your future career. I realized that a lot of organizations here in Zimbabwe use Power BI and Tableau. Some statistics organizations use SPSS.
The best advice I ever got was from my brother. I was in high school and trying to decide which degree programme I would choose when I go to university. And I wanted a programme that would get me a job which pays a lot. He’s more than 10 years older than me so he and his friends were already working at the time. He told me that any profession can make a lot of money, so I should just pick a path and commit. Not that quitting wasn’t an option but that you need to give things time because the most rewarding things often take time.