Original is posted here (CZ) and here (EN)
Intro
Amid many global socio-political changes, big data and analytics have become essential tools for doing business and ensuring company growth. The ongoing rise of big data, including cloud computing, has reshaped global tech trends. In 2023, we expect a similar surge for new and innovative technologies, which will ensure more efficient processes and operations.
Cloud Adoption
According to Gartner, 70% of companies have already partially migrated to the cloud, and 95% of new solutions by 2025 will be deployed to the cloud. While this trend will continue in 2023, cloud migration carries risks and challenges that should be weighed alongside its benefits.
The first challenge is the risk of vendor lock-in. Once you migrate your system to the cloud, your solution might become tightly coupled to the services of a particular cloud provider. That's why it is essential to think about an exit strategy beforehand and use a cloud-agnostic architecture that keeps further migrations possible. Another potential answer to this challenge is to rely on multi-cloud data platforms such as Snowflake or Databricks.
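One common way to keep an architecture cloud-agnostic is to have application code depend on a small provider-neutral interface rather than on a provider SDK. The sketch below is only an illustration of that idea: `BlobStore`, `LocalBlobStore`, `InMemoryBlobStore` and `archive_report` are hypothetical names, and the in-memory class merely stands in for a backend that would wrap a real cloud storage client.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class BlobStore(ABC):
    """Hypothetical provider-neutral storage interface."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class LocalBlobStore(BlobStore):
    """Filesystem-backed implementation, e.g. for on-premise use or tests."""

    def __init__(self, root: Path) -> None:
        self.root = root

    def put(self, key: str, data: bytes) -> None:
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


class InMemoryBlobStore(BlobStore):
    """Stand-in for a second backend; a real one would wrap a provider SDK."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


def archive_report(store: BlobStore, name: str, body: str) -> str:
    """Business logic sees only the BlobStore interface, never a provider."""
    key = f"reports/{name}.txt"
    store.put(key, body.encode("utf-8"))
    return key
```

Because `archive_report` only knows about `BlobStore`, moving to another provider means writing one new subclass instead of touching the business logic.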
Moreover, each cloud provider has its strengths and weaknesses. Sometimes it may be better to choose one provider for machine learning and another for the data warehouse. There is an emerging need for inter-cloud technologies which enable parts of data solutions to seamlessly collaborate across services of different cloud providers (and often with on-premise systems).
Another challenge worth mentioning is that not all data systems can be hosted on public clouds. For instance, regulatory limitations may prevent data from being placed on public clouds or make doing so risky. Companies that still want some of the benefits of the cloud often decide to run an in-house (private) cloud. In this case, they may benefit from virtualization platforms like OpenStack or use on-premise cloud services like Azure Stack.
Regulatory Demands
According to Gartner, 75% of the world’s population will have their personal data covered by GDPR-like regulations by 2024. Beyond the obvious need for proper data security, this drives greater interest in data governance as a framework for companies to understand and manage their data. Data governance is therefore no longer just an internal request from management that wants to increase efficiency and turn data into an asset; it is becoming a mandatory external requirement.
Data governance consists of many important elements, including the following:
- A data catalog allows enterprises to systematically track information about all their data assets and ensures that no data is left outside the framework.
- Data lineage tracks the paths data takes across the company and ensures a shared view of the inputs, outputs and transformations along each path.
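To make these two elements concrete, here is a minimal sketch of a catalog that records datasets together with their direct inputs and derives lineage from them. All names (`Dataset`, `Catalog`, `lineage`, `uncatalogued`, and the sample datasets) are hypothetical and chosen for illustration; dedicated governance tools store far richer metadata.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    owner: str
    upstream: list[str] = field(default_factory=list)  # direct inputs


class Catalog:
    def __init__(self) -> None:
        self._datasets: dict[str, Dataset] = {}

    def register(self, dataset: Dataset) -> None:
        self._datasets[dataset.name] = dataset

    def lineage(self, name: str) -> list[str]:
        """Return every dataset feeding into `name`, nearest inputs first."""
        seen: list[str] = []
        queue = list(self._datasets[name].upstream)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            if current in self._datasets:
                queue.extend(self._datasets[current].upstream)
        return seen

    def uncatalogued(self) -> set[str]:
        """Inputs that are referenced somewhere but were never registered."""
        referenced = {u for d in self._datasets.values() for u in d.upstream}
        return referenced - set(self._datasets)


catalog = Catalog()
catalog.register(Dataset("crm_raw", owner="sales"))
catalog.register(Dataset("customers_clean", owner="data-eng", upstream=["crm_raw"]))
catalog.register(
    Dataset("churn_report", owner="analytics", upstream=["customers_clean", "web_logs"])
)

print(catalog.lineage("churn_report"))  # ['customers_clean', 'web_logs', 'crm_raw']
print(catalog.uncatalogued())           # {'web_logs'}
```

The `uncatalogued` check mirrors the catalog's goal of leaving no data outside the framework: `web_logs` is used by a report but has no owner on record.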
Data Democratization
- The first aspect is self-service solutions that allow employees to explore data independently. These may be reporting tools like Power BI or low-code automation tools like Alteryx. Another option is to expose data via APIs and allow research with scripting languages like Python.
- The second aspect is the disclosure of metadata so that employees understand what data exists in the company and how it can be interpreted. Again, this brings us to the idea of data catalogs.
- The third important aspect is data literacy. It is not enough to disclose data. It is important to ensure that employees understand and work with data properly.
- And obviously, security and access separation aspects must be considered.
Artificial Intelligence (AI) Adoption
- Improving data observability through AI tools. These can automate data discovery, for example by identifying sensitive personal data and finding entities in data. Another potential area of application is data quality: AI tools may further automate this process by detecting data issues automatically and potentially even fixing them.
- Augmented analytics simplifies exploratory analysis and leverages AI tools in the data preparation and feature engineering stages.
- The topic of responsible AI is becoming increasingly important. One should not put an AI model into production without ensuring the fairness of its input data and being able to explain its output.
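As a rough illustration of the automated data discovery mentioned in the first point, the sketch below flags columns that look like they contain personal data. It is deliberately a rule-based stand-in using regular expressions, not an AI model: the patterns, the `scan_columns` helper, the 0.8 threshold and the sample rows are all assumptions made for this example, and production tools typically combine such rules with trained classifiers.

```python
import re

# Illustrative patterns only; real personal-data detection needs far more.
PATTERNS = {
    "email": re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$"),
    "phone": re.compile(r"^\+?\d[\d\s-]{7,}\d$"),
}


def scan_columns(rows: list[dict[str, str]], threshold: float = 0.8) -> dict[str, str]:
    """Flag columns where at least `threshold` of non-empty values match a pattern."""
    flagged: dict[str, str] = {}
    columns = {key for row in rows for key in row}
    for column in sorted(columns):
        values = [row[column].strip() for row in rows if row.get(column, "").strip()]
        if not values:
            continue
        for label, pattern in PATTERNS.items():
            hits = sum(1 for value in values if pattern.match(value))
            if hits / len(values) >= threshold:
                flagged[column] = label
                break
    return flagged


rows = [
    {"id": "1", "contact": "anna@example.com", "mobile": "+420 601 123 456"},
    {"id": "2", "contact": "bob@example.org", "mobile": "601-123-457"},
    {"id": "3", "contact": "", "mobile": "+420601123458"},
]

print(scan_columns(rows))  # {'contact': 'email', 'mobile': 'phone'}
```

Working on a share of matching values rather than requiring every value to match keeps the scan tolerant of empty or dirty cells, which is the usual state of real data.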