The two sides of Data Science
There is a dirty little secret you’ll learn over your Data Science career, and that is: it revolves around people!
Data Science inherently possesses a dual nature, each side complementing and contrasting the other. On one side, we have the Human Side of Data Science, which requires an understanding of business dynamics, entrepreneurial thinking and solving problems by working … with people. It’s about aligning data-driven insights with organisational goals and visions, so that technological advances are integrated with human-centric values and objectives.
On the other side, the Science Side of Data Science delves into the meticulous methodologies, the pursuit of reproducibility, and the adherence to rigorous documentation, embodying the essence of the scientific method. It’s a realm where the philosophical foundations of science intertwine with innovative technologies, fostering a culture of inquiry, exploration, and continual learning.
Without grasping the business's revenue, cost, and profit centres, a data scientist can't truly discern their value contribution.
Indeed, there exists an unspoken truth, a revelation that unfolds gradually over the course of one’s journey in Data Science, and it is this: Data Science is not merely a realm of data and algorithms as one might initially perceive; fundamentally and unequivocally,
IT IS ANCHORED IN PEOPLE!
Data Science is fundamentally a fusion of understanding and human connection. It’s a field where human needs, desires, and interactions are interlaced with algorithms and patterns found in data.
The essence is not found in the realm of numbers, code and algorithms but in the dynamic realm of human experience and interaction. The true potential of Data Science is unlocked not merely by delving into the data but by connecting more deeply with the people whom the data represents.
Human Side of Data Science
Understanding How the Business Makes Money
For a data scientist, adopting an entrepreneurial mindset is not just beneficial—it’s essential.
The prevalent culture often moulds individuals to be followers, but the wisdom shared by McKinsey suggests a shift in perspective. Managing a project or leading a team should be approached with the same strategic foresight and meticulousness as running a company.
This encompasses a profound understanding of the company’s vision, goals, budget, cost, and income. This insight is not just about knowing the financial dynamics; it’s about discerning where one can inject value, optimise income, and curtail losses within those dynamics. The essence of this perspective is eloquently encapsulated by a CFO’s words, "If you don't understand how the business generates money, you are missing the plot on where or how your contributions can add value."
Without a basic understanding of which parts of the business generate revenue, profit and cost, it is impossible for a data scientist to fully appreciate how they add value.
Defining the Problem
The journey of data science begins long before the exploration of data—it starts with defining the problem. This step is the compass that navigates the entire data science process, ensuring every analytical endeavour is tethered to real business needs. It’s not just about identifying what the problem is; it’s about understanding its nuances, its impact, and its relevance within the business context. A well-articulated problem statement is the foundation upon which the right data can be identified, appropriate methodologies can be selected, and meaningful, actionable insights can be derived.
Commercial Value Proposition
The intersection of data science and business is a complex landscape, and within this, articulating the commercial value proposition is a pivotal task. It’s about translating intricate analytical insights into clear, tangible business value. It’s about ensuring that the fruits of data science initiatives are not just technically sound but are also aligned with the strategic imperatives of the organisation. This alignment is crucial for the initiatives to resonate with stakeholders and to contribute substantively to the commercial trajectory of the organisation.
Iterative Stakeholder Delivery
Data science is not a realm of isolated endeavours; it’s a dynamic, iterative journey involving continuous collaboration with stakeholders. The delivery of insights is a cyclical process of refinement, a dialogue between data scientists and stakeholders to fine-tune solutions, ensuring their relevance, applicability, and impact. This iterative engagement is crucial for the evolution of solutions that are not just analytically robust but also business-centric, maximising the transformative impact of data science on decision-making and operational strategies.
Reflection:
The human side of data science is a multifaceted domain, intertwining technical acumen with a nuanced understanding of business intricacies. It’s about harmonising analytical rigour with organisational insights, navigating the multifarious dimensions of business with strategic acuity and empathetic understanding. It’s about moulding solutions that are reflective of both the analytical depth and the commercial realities of the organisation.
Science Side of Data Science
The science in data science is a meticulous pursuit, deeply rooted in the philosophy of science, focusing on the reproducibility, documentation, and validation of results. Here’s a refined exploration into the scientific dimensions of data science, each related to the philosophy of science and the scientific method:
One Pager:
The one pager succinctly articulates the essence of the project, outlining objectives, methodologies, and expected outcomes.
It serves as the foundational blueprint, ensuring alignment with project goals and providing a clear vision, crucial for the scientific integrity of the project.
The philosophy behind it is of clarity and coherence in scientific communication, ensuring the accessibility and understandability of scientific pursuits. It serves as the key ‘one-stop’ documentation that others can use for reproducibility.
This can be achieved utilising structured templates focusing on clarity and conciseness, avoiding jargon to ensure comprehension across diverse stakeholders.
Literature Review:
The literature review delves into existing research and scholarly discourse to construct a well-rounded understanding of the current knowledge landscape.
It identifies knowledge gaps and informs the research question, and it is a reminder that the most complex solution is not always the best. Put simply, we don’t need to reinvent the wheel.
The philosophy behind it is that scientific knowledge is cumulative and collaborative, building upon previous insights and discoveries.
Methodological Review:
The Methodological Review meticulously details the methodologies and processes employed, ensuring the validity and reproducibility of the project.
It safeguards the scientific rigour of the project, allowing for peer review and replication - which is crucial for the credibility and reliability of the findings.
The philosophy behind it aligns with the scientific method’s emphasis on systematic observation, measurement, and modification, ensuring the project’s adherence to scientific principles.
This can be achieved by developing comprehensive frameworks detailing each methodological step, allowing for scrutiny, validation, and replication. It is also important to have regular peer reviews in place, as ‘steel manning’ the methodology is critical to the project’s success.
Reproducibility & Documentation:
Reproducibility & Documentation means maintaining exhaustive documentation so that results and methodologies can be replicated by others.
It upholds the integrity and transparency of the scientific process, allowing for the continual refinement and validation of scientific knowledge.
The philosophy behind it resonates with the scientific method’s core principle of reproducibility.
This can be achieved by implementing rigorous documentation protocols and using a version control system such as Git, hosted on a platform like GitHub, to facilitate collaborative development and review.
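As a rough illustration of what a documentation protocol can look like in code, the sketch below records everything needed to repeat a run: the random seed, the parameters, a hash of the input data, and the Python version. The function names (`fingerprint`, `run_experiment`), the record fields, and the input data are all hypothetical and chosen only for this example; a real project would likely log more (package versions, the Git commit, and so on).

```python
import hashlib
import json
import platform
import random


def fingerprint(rows):
    """Return a SHA-256 hash of the input data so a later run can confirm it used the same inputs."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def run_experiment(rows, seed, sample_size):
    """Run a toy analysis (mean of a random sample) and return a record that documents the run."""
    rng = random.Random(seed)  # local generator, so no hidden global state affects the result
    sample = rng.sample(rows, sample_size)
    result = sum(sample) / sample_size
    return {
        "seed": seed,
        "sample_size": sample_size,
        "data_sha256": fingerprint(rows),
        "python_version": platform.python_version(),
        "result": result,
    }


rows = list(range(1, 101))  # hypothetical input data, for illustration only
first = run_experiment(rows, seed=42, sample_size=10)
second = run_experiment(rows, seed=42, sample_size=10)

# Same data, same seed, same parameters: the two records should be identical.
print(json.dumps(first, indent=2))
print("reproducible:", first == second)
```

Committing a record like this alongside the code gives a reviewer the seed, the parameters, and a check that the data has not changed, which is usually enough to rerun the analysis and compare results.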
Testing & Validation:
Testing & Validation means rigorously checking hypotheses, models, and results to establish their robustness and reliability.
It mitigates biases and errors, ensuring the findings’ credibility and contributing to the robustness of scientific knowledge.
The philosophy behind it embodies the scientific method’s emphasis on empirical validation and critical assessment, refining the reliability of scientific insights.
This can be achieved by employing diverse testing methods, including statistical significance testing and sensitivity analysis, to validate and refine the findings.
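To make the two methods named above concrete, here is a minimal sketch using only the Python standard library. It runs a permutation test as one form of statistical significance testing, and a leave-one-out check as one simple form of sensitivity analysis. The function names, the default of 5,000 permutations, and the `control` and `treatment` figures are assumptions made up for illustration, not results from any real project.

```python
import random
from statistics import mean


def permutation_test(a, b, n_permutations=5_000, seed=0):
    """Estimate a two-sided p-value for the difference in means between groups a and b."""
    rng = random.Random(seed)  # seeded so the estimate itself is reproducible
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign group labels
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


def leave_one_out_sensitivity(a, b):
    """Return the (min, max) difference in means when each observation in a is dropped in turn."""
    a = list(a)
    diffs = [mean(a[:i] + a[i + 1:]) - mean(b) for i in range(len(a))]
    return min(diffs), max(diffs)


# Hypothetical measurements, for illustration only.
control = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
treatment = [10.9, 11.2, 10.8, 11.1, 11.0, 10.7]

p_value = permutation_test(treatment, control)
low, high = leave_one_out_sensitivity(treatment, control)

print(f"estimated p-value: {p_value:.4f}")
print(f"difference in means stays between {low:.3f} and {high:.3f} when one point is removed")
```

A small p-value suggests the difference is unlikely to be due to chance alone, while a narrow leave-one-out range suggests the finding does not hinge on a single observation. Neither check replaces peer review of the methodology.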
Closing remarks:
Data Science is a dynamic interplay between human insight and scientific precision, acting as a guide towards decision-making and data discovery. By blending entrepreneurial insight with scientific methodologies, we formulate solutions that are technologically robust and attuned to human and commercial needs. It’s this synergy between the human and scientific elements that advances Data Science, making it an innovator, a pillar for organisational achievement, and a medium for delving into new realms of knowledge.