In the previous article and video, I explained the important factors in having a successful data analytics team. One of those aspects is the team, the people, and their roles. In this article and video, I will explain the roles involved in the data analytics project and the types of tasks and requirements for each role.
Part 1: Building the Analytics team in the golden era of technology
If you haven’t read part 1 of this article, I suggest you start with that first.
Roles in the Analytics team
There are many roles involved in building the analytics team, some of which are external (outside of the team but still part of its mission and goal). Here are some of those roles. Depending on the size of the team, you might have only some of these, or several of them combined in one person;
- Architect
- Administrator
- ETL Developer
- Database Developer
- Data Engineer
- Data Scientist
- Data Modeler
- Data Analyst
- Project Manager
- Business Analyst
- Tester
- Deployment manager
- Analytics manager, team leader
- Project sponsor
- Application Developer / Programmer
- Consultant
Now, let’s go through each role, with its responsibilities and tasks.
Project sponsor
This is one of the non-technical roles and may not even sit inside the analytics team. However, it is a very important part of the analytics project. Without a project sponsor, it would be hard for the executive leadership team to understand the value of analytics, and after some time the analytics team may not have enough budget to continue the work. The sponsor communicates the value of analytics to the business stakeholders and gets their approval for the analytics team’s budget.
Here are some characteristics of this role;
- CXO level
- Supports the analytical project in the board meeting
- Links the board’s direction and the vision of the organization to the analytics project
- Passes on the priorities from the board to the analytics team
Business Analyst
This is another non-technical role in the analytics team. A business analyst is someone who has been working in the business long enough to understand how the business operates, what systems are used for what operations, what metrics and analytics are important for each operation process, and what system has the data related to that. Business analysts also spend a lot of time gathering requirements, meeting with business stakeholders, and passing these requirements on to the rest of the team for the implementation phase.
Here are some characteristics of this role;
- Having subject matter expertise
- Requirement gathering
- Understanding the requirements of business stakeholders
- Understanding if the data for such a requirement exists in the source system
- Providing the requirement to the technical team for starting the implementation
Project Manager
Like any other team in the organization, the Analytics team works on projects that need to be managed. A project manager is another non-technical role that would manage the project within the timelines and milestones and under the approved budget. The project manager works with the development team and has regular updates and meetings to discover any risks in meeting the deadlines. The project manager will also work on resource planning.
Here are some characteristics of this role;
- Managing the entire analytics project from start to finish
- Ensuring the project meets the deadline
- Managing the risks that arise during the work
- Working with the team on resource management to fit into the deadlines
- Working with the business to provide regular updates about the progress
Administrator
The Administrator mentioned here is a technical administrator: someone who takes care of the administration of databases, data warehouses, Power BI and Fabric tenant settings, and workspaces. This person ensures backups are taken regularly, has version control in place, reviews the audit logs to see how items are used, and checks whether the capacity requires extra power.
Here are some of the characteristics of this role;
- Database Administration
- Power BI and Fabric tenant settings
- Workspace administration
- Gateway administration
- Backups, versioning
- Usage metrics reports
ETL Developer
ETL stands for Extract, Transform, and Load. An ETL Developer is responsible for getting data from operational source systems and integrating it into a data warehouse. An ETL developer can use tools such as dataflows, data pipelines, SSIS, T-SQL, and sometimes even custom applications for API calls. ETL developers also need to understand some data warehousing concepts, such as SCD (Slowly Changing Dimension), and how to implement them.
Here are some of the characteristics of this role;
- Knowing data integration tools: Data Factory, Data Pipeline, Dataflow, Power Query, SSIS
- Knowing data integration terms (SCD, inferred dimension members, late-arriving facts) and how to implement them using the tools
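To make the SCD term more concrete, here is a minimal sketch of a Type 2 slowly changing dimension in plain Python. It is only an illustration under assumptions: the `customer_id` and `city` columns, the `valid_from`/`valid_to`/`is_current` tracking columns, and the `apply_scd2` helper are all hypothetical names, and a real implementation would normally live in a dataflow, a pipeline, or T-SQL rather than in a Python list.

```python
from datetime import date


def apply_scd2(dimension, incoming, load_date):
    """Apply SCD Type 2 changes to a (hypothetical) customer dimension in place.

    dimension: list of dicts with customer_id, city, valid_from, valid_to, is_current
    incoming:  list of dicts with customer_id, city (the latest source snapshot)
    """
    # Index the current version of each business key for quick lookup.
    current = {row["customer_id"]: row for row in dimension if row["is_current"]}

    for src in incoming:
        existing = current.get(src["customer_id"])
        if existing is not None and existing["city"] == src["city"]:
            continue  # nothing changed, keep the current version as it is
        if existing is not None:
            # Expire the old version instead of overwriting it, so history is kept.
            existing["valid_to"] = load_date
            existing["is_current"] = False
        # Insert a new current version (this also covers brand-new keys).
        dimension.append({
            "customer_id": src["customer_id"],
            "city": src["city"],
            "valid_from": load_date,
            "valid_to": None,
            "is_current": True,
        })
    return dimension


dim = [{"customer_id": 1, "city": "Auckland", "valid_from": date(2023, 1, 1),
        "valid_to": None, "is_current": True}]
source = [{"customer_id": 1, "city": "Wellington"}, {"customer_id": 2, "city": "Sydney"}]
apply_scd2(dim, source, date(2024, 6, 1))
```

After this run, the Auckland row is kept but marked as expired, and two new current rows exist: one for the changed customer and one for the new customer. Loading the same source again adds nothing, which is the behavior you want from a repeatable ETL process.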
Database Developer
Depending on the team size, this role and the ETL developer may be combined into the tasks of one person. A database developer is a person who mainly works with database tables, writes T-SQL queries, stored procedures, and views, and works on the performance of the scripts running on the database. This role should also have a good understanding of data warehousing concepts such as SCD, inferred dimension members, and star schema.
Here are some characteristics of this role;
- SQL scripts (stored procedures, views, etc.)
- Implementing things such as SCD (Slowly Changing Dimension)
- Working on the database performance whenever needed
Data Engineer
Some analytics teams need to work with data at different levels. The data might be so big that it arrives as raw files, which require some handling before they are loaded into the database or data warehouse. A data engineer usually works with tools such as notebooks, uses languages such as Python, R, Scala, and Spark SQL to work with big data engines such as Spark, and knows how to optimize the usage of Spark pools by customizing the nodes.
Here are some characteristics of this role;
- Working with Spark
- Configuring Spark pools
- Using notebooks with languages such as
  - PySpark
  - SparkR
  - Scala
  - Spark SQL
- Working with Lakehouses
Data Scientist
Often, data analytics projects get into the realm of predictive or even prescriptive analytics. A data scientist steps beyond descriptive analytics (analyzing the data from the past) and finds patterns in the data to do predictive analytics (predicting what is about to happen), and sometimes even recommends the action to be taken (prescriptive analytics). A data scientist uses tools such as notebooks and languages such as Python and R, but also has a deep understanding of machine learning algorithms (e.g., decision trees, clustering, etc.) and how to customize them using parameters. A data scientist regularly trains models on the data to find the patterns, tests them against the test data until a desired outcome is achieved, and operationalizes the best one. This process is ongoing because the data changes.
Here are some characteristics of this role;
- Understanding of data science process
- Knowing about machine learning algorithms
- Knowing libraries for machine learning: SparkML, LightGBM …
- Knowing frameworks for MLOps: MLflow
- Knowing how to train models, use algorithms, change hyperparameters, evaluate the results, repeat the cycle until the outcome is reasonable, and operationalize the best model
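The train, tune, and evaluate cycle described above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the data is a made-up, single-feature dataset with noisy labels, the algorithm is a tiny hand-written k-nearest-neighbours classifier rather than one from SparkML or LightGBM, and `k` is the only hyperparameter being changed.

```python
import random


def knn_predict(train, point, k):
    """Predict a label by majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda row: abs(row[0] - point))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def accuracy(train, test, k):
    """Share of test rows whose predicted label matches the real label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


# Hypothetical toy data: the label is 1 when the feature is above 50,
# with roughly 10% of the labels flipped to act as noise.
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.uniform(0, 100)
    label = int(x > 50)
    if rng.random() < 0.1:
        label = 1 - label
    data.append((x, label))

# Hold out test data that the model never sees during training.
rng.shuffle(data)
train, test = data[:150], data[150:]

# Repeat the cycle for several values of k (odd values avoid tied votes)
# and keep the one that scores best on the held-out data.
results = {k: accuracy(train, test, k) for k in (1, 3, 5, 9, 15)}
best_k = max(results, key=results.get)
print(results, "best k:", best_k)
```

In a real project, the hyperparameter would usually be chosen on a separate validation set (or with cross-validation), keeping the test data for the final check only, and each run would be tracked in a framework such as MLflow.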
Data Modeler (Power BI)
With technologies such as Fabric and Power BI, a semantic model must be built on top of the data warehouse. This model holds extra logic, as calculations are added to it for the analytics project. A data modeler is someone who has a good understanding of Power BI modeling, of relationships in Power BI and their direction, and of DAX, and who can write DAX measures and calculations to answer analytical requirements.
Here are some characteristics of this role;
- Star schema (Dimension and Fact tables)
- Table relationships in Power BI (including bidirectional vs. single-directional relationships, cardinality, etc.)
- DAX (Data Analysis Expressions)
- Writing calculations in the model to serve the requirement
Data Analyst (Visualization)
Visualization is an art; it is important that the team spends time creating effective visualizations. A good visualization conveys the right message to the users. All the work you do in a BI system is like the work done in the kitchen of a restaurant, and the visualization is the plate that comes to the table for customers. A data visualizer or analyst needs to know the different kinds of visualizations, the pros and cons of each, and the best way to present items, and also needs to know the technical side of using a tool such as Power BI to bring those ideas to life.
Here are some characteristics of this role;
- Understanding the art of data visualization
- Knowing about data visualization technologies such as Power BI
- Knowing how to build effective visualization
- Understanding different types of visuals and how to use them in the current tools to answer the business’s requirements.
Architect
In an analytics project, many components are involved: the data warehouse, the ETL process, the staging database, the deployment pipeline, the semantic model, the reports, the workspaces, etc. Combining all of these items together needs to be done correctly so that the entire solution can be maintained easily and the team can work together in an environment where changes and improvements can be made regularly and often. An architect is someone who has a good understanding of each part of the technology and how it works with another part. The architect is someone who would come up with the layout of the tools and technologies to use and the plan for facing any issues.
Here are some of the characteristics of this role;
- Knowing all the tools, services, and products and how they are used in the project; not at a deep-dive level, but enough to know how to combine them
- Suggesting the best combination of tools and technologies for the team to use
- Suggesting a layout that combines the different components
- Defining standards and conventions for the team to use
- Staying up to date with new technologies and suggesting them whenever needed
Application developer | Programmer
An analytics team may not need a full-time programmer, but based on my experience with some of my customers, having a programmer on the team often helps. This can be for scenarios where you want to embed the Power BI reports in a custom application or when you want to get data from a source system that requires specific handling through code.
Here are some characteristics of this role;
- Having expertise in programming languages and understanding how to use those to work with analytical objects
- Front-end/back-end development experience
- The involvement might be in things such as
  - Building a configuration app for the analytics project
  - Embedding reports and visualizations into custom web applications
  - Integrating APIs into the data analytics project
Tester
The work done by the team’s developers needs to be tested. Testing might involve simple functions such as navigations in the report pages, or it can also involve complex but important items such as reconciling the numbers with the source systems and ensuring the report’s outcome is correct.
Here are some characteristics of this role;
- Understanding the business requirements
- Testing the current reports, measures, calculations
- Reconciling numbers
- Checking that all the buttons and navigation work as expected
- Checking the security configurations and who has access to what
- Giving a thumbs up to the deployment manager for the next step
- Giving feedback to developers for fixing potential issues
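Reconciling numbers is the part of testing that is easiest to automate. Here is a minimal sketch, assuming the tester can export monthly totals from both the source system and the report; the `reconcile` helper, the month keys, and the amounts are all hypothetical.

```python
from math import isclose


def reconcile(source_totals, report_totals, tolerance=0.01):
    """Return (key, source_value, report_value) for every mismatch found."""
    mismatches = []
    for key in sorted(set(source_totals) | set(report_totals)):
        src = source_totals.get(key)
        rpt = report_totals.get(key)
        # A key missing on either side is always a mismatch; otherwise the
        # two numbers must agree within the allowed rounding tolerance.
        if src is None or rpt is None or not isclose(src, rpt, abs_tol=tolerance):
            mismatches.append((key, src, rpt))
    return mismatches


# Made-up monthly sales totals from the source system and from the report.
source = {"2024-01": 10500.00, "2024-02": 9800.50, "2024-03": 11020.25}
report = {"2024-01": 10500.00, "2024-02": 9750.50}

issues = reconcile(source, report)
for key, src, rpt in issues:
    print(f"{key}: source={src} report={rpt}")
```

Here February differs between the two systems and March is missing from the report, so both are flagged. An empty result is what the tester would want to see before giving the thumbs up to the deployment manager.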
Deployment manager
This role often gets combined with the Administrator (depending on the team size). This role checks with the tester to get the thumbs up for quality control and also works with developers and the rest of the team to ensure that the content is ready to be published live for the end users. It also ensures that there is a backup plan to restore to the previous live version in case something goes wrong.
Here are some of the characteristics of this role;
- Creating deployment pipelines
- Building environments and managing Dev, Test, and Prod
- Managing version control
- Controlling the deployment history
- Ensuring the tests are done before a deployment
- Rolling back to the previous version whenever needed
Team Leader | Manager
Like any other team, the analytics team requires a leader. This role ensures the team works together as a unit towards its goal. The leader works closely with each team member to ensure they are at their best, and keeps the team’s morale high, not only through meetings but also through team functions, etc. This role also works closely with the project sponsor to ensure the analytics team is following the vision of the project and team.
Here are some of the characteristics of this role;
- Managing the team (people management)
- Ensuring the team works together as one unit
- Ensuring the team moves together towards the common goal of the analytics project
- Helping the project manager with resourcing
- Working on the morale of the team (team functions…)
- Working with the architect to develop an architecture that works within the current budget for the project.
Consultant
This is a role external to the team. Sometimes the team needs on-demand help from an experienced person outside the team. A consultant can help with the architecture, with performance tuning, or sometimes even with speeding up the process of writing a DAX formula.
Here are some of the characteristics of this role;
- On-demand
- Performance tuning
- Best practices advice
- Peer review
- Bringing expertise from a wider range of use cases
- Usually not a full-time role
Self-Service | Superusers
Depending on the culture in the organization, you may have business analysts and self-service users in different departments who want to get their hands on the data provided by the analytics team, combine it with their own datasets, and build their own reports and dashboards. This often helps the analytics team by reducing their workload, but it will also require proper governance practice.
Here are some characteristics of this role;
- Outside of the analytics team
- Helping on the analytics project
- They will need proper training that fits their requirements
- They will need proper support from the analytics team with the data
Summary
In summary, there are many roles involved in an analytics team. However, depending on the size of the team, some of these roles might be combined into one. I highly recommend reading my other article about building a successful analytics team, which talks about other aspects of it and not just the team roles.