What kind of mindset do you look for in a data scientist?
On the technical side, a data scientist should have strong numerical and analytical skills, i.e. a good knowledge of maths and statistics. A data scientist should also know how to code. Python is usually preferred to R, and knowledge of Spark is also useful.
Why is Python preferred to R?
The preference is generally based on IT considerations: Python is more robust, and it is easier to build industrial-level products with it. It is also easier to integrate Python with different software platforms and cloud services. In contrast, R is more frequently used in research-oriented contexts.
Are there any interpersonal or non-technical skills that you look for in a data scientist?
Curiosity: a data scientist should be curious about business and value creation. He or she should always want to understand why, be self-motivated, and keep learning.
Be realistic: I sometimes see data scientists treat a data science project more like a Kaggle competition than a business project. However, real business data are usually not clean, which sometimes limits the effectiveness of models. It takes time to acquire data of decent quality, and you should be realistic in choosing the model that performs best on your data.
Resilience: working in industry can often be challenging. You should be prepared to tackle a variety of business problems and have the motivation to solve them.
Principle of parsimony in modelling: it is tempting to develop a very advanced but complicated model that solves the problem perfectly. However, if a simpler model is already good enough to explain a problem, then you should consider the simpler model. A simpler model is easier to interpret and communicate, so it is easier for stakeholders to understand and to use when convincing other people.
While we always look for the qualities listed above, it can be difficult to find a well-rounded talent who has all of them. Data scientists therefore work as a team, in which each member contributes his/her own unique strengths.
Could you briefly describe what it’s like to work in Argos’ data analytics team?
Data science expertise generally involves three domains: business, mathematics / statistics, and coding. Our data science team works on commercial and supply chain optimization, so we tend to focus more on the business side of the problem, but it is important to have acumen in both business and technology. Most of our programming and modelling work is done in Python, with data preparation done in SQL (using Microsoft SQL Server and now increasingly Snowflake, since databases have been migrating to the cloud).
Some of our tasks involve forecasting what customers will buy so that we can minimize cost and maximize sales and satisfaction. We also provide data science support to other functions, such as production, customer behaviour modelling and inventory management, upon request.
A large part of our work concerns improving old methods. As Argos stores grow in number and shrink in size, there is increasing demand for models (algorithms), such as linear programming, that optimize the use of space. The improvement process is iterative: rather than implementing a completely new model in one large change, we tend to do it in multiple steps and keep records where necessary. In this way, we make sure that any change added to the original can be understood by stakeholders and its impact monitored in the real world. Not everything works in practice according to our modelling assumptions and simulations, so sometimes we have to refine and adjust the models after they have been trialled.
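As a rough illustration of what a space-optimization model can look like, here is a minimal Python sketch of a very small linear program: choose how many units of each product to display so that profit is maximized without exceeding the available space. It is not the team's actual model. The function name, the product figures and the single space constraint are all assumptions made for the example. With only one constraint and simple upper bounds, filling the space in order of profit per unit of space gives the exact optimum of the linear program; a realistic model with many constraints would need a proper LP solver.

```python
def allocate_shelf_space(products, total_space):
    """Solve a tiny linear program (hypothetical example):

        maximize    sum(profit_i * x_i)
        subject to  sum(space_i * x_i) <= total_space
                    0 <= x_i <= max_units_i

    `products` is a list of (name, profit_per_unit, space_per_unit, max_units),
    with profit_per_unit >= 0 and space_per_unit > 0.
    With a single constraint, the greedy ratio rule below is exact.
    """
    allocation = {name: 0.0 for name, *_ in products}
    remaining = float(total_space)

    # Rank products by profit earned per unit of space, best first.
    ranked = sorted(products, key=lambda p: p[1] / p[2], reverse=True)

    for name, profit, space, max_units in ranked:
        if remaining <= 0:
            break
        # Take as many units as the bound and the remaining space allow.
        units = float(min(max_units, remaining / space))
        allocation[name] = units
        remaining -= units * space

    total_profit = sum(allocation[name] * profit for name, profit, _, _ in products)
    return allocation, total_profit


# Illustrative (made-up) figures: name, profit per unit, space per unit, max units.
example_products = [
    ("A", 6, 2, 10),
    ("B", 5, 1, 8),
    ("C", 4, 4, 5),
]
allocation, profit = allocate_shelf_space(example_products, total_space=20)
print(allocation, profit)
```

Keeping the model this small also fits the iterative approach described above: each extra constraint can be added as a separate, explainable step.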
Data scientists can come from very different career paths. What are some of the most common career transitions you’ve seen people make into this field?
The use of data science will become broader and more embedded in the systems and reports that automate or support business decision-making. Predictive and prescriptive modelling will continue to expand.
The application of data science is now moving into inventory and delivery optimization, as well as store network planning, i.e. where to open stores and warehouses.
While an expert’s gut feeling is important, we aim to use data science to supplement the decision-making process with more fact-based, data-supported evidence. For example, when a new product is to be launched, a market expert may forecast sales figures based on his/her past experience. We, on the other hand, would forecast sales with analytical models that draw evidence from historical sales data. This also reduces human bias to some extent.
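The interview does not say which forecasting models are used. Purely as an illustration of forecasting from historical sales data, the sketch below applies simple exponential smoothing, a standard baseline method. The function name, the smoothing factor `alpha` and the sales figures are assumptions made for the example.

```python
def exponential_smoothing_forecast(sales, alpha=0.3):
    """Return a one-step-ahead forecast using simple exponential smoothing.

    `sales` is a chronological list of past sales figures (hypothetical data).
    `alpha` in (0, 1] controls how strongly recent observations are weighted.
    """
    if not sales:
        raise ValueError("sales history must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")

    # Start from the first observation, then blend in each newer one.
    level = float(sales[0])
    for observed in sales[1:]:
        level = alpha * observed + (1 - alpha) * level
    return level


# Made-up weekly sales of a comparable product.
weekly_sales = [120, 135, 128, 150, 142]
print(exponential_smoothing_forecast(weekly_sales, alpha=0.3))
```

A baseline like this is easy to explain to stakeholders, and its forecast can be set alongside an expert's estimate rather than replacing it.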
Once we have found a suitable algorithm, we try to have it built by the IT team, so there is still a gap between designing a model and building it, i.e. the gap between the prototype and the production-level model. Part of my work has been narrowing this gap so that the model development process can be more scalable and transferable.
Are there any opportunities in this for students?
We are already working under a three-way Knowledge Transfer Partnership (KTP, a government scheme), in which a business collaborates with a university and with an associate who is supervised by that university. We have also opened some projects to university students, such as solving data quality issues. We are not actively recruiting at the moment, but we will explore such opportunities in the future.
These points reflect the personal views of our interviewee. None of the discussion points reflects the views of any organisation he has worked for or is currently working for.