Development • dev.to • May 13, 2024, 3:40
For years, JDBC and ODBC have been the commonly adopted standards for database interaction. Now the rise of data science and data lake analytics brings bigger and bigger datasets. Correspondingly, we need... read more
database dataengineering python apachearrow
Development • dev.to • May 12, 2024, 8:46
Introduction: Prometheus has established itself as a cornerstone in the monitoring and alerting ecosystem, favored for its straightforwardness and efficiency in handling real-time metrics. Central to its operation is a data model where each sample c... read more
prometheus monitoring database dataengineering
Development • dev.to • May 10, 2024, 17:57
Introduction: If you are a machine learning engineer who has started to productionize a model prediction pipeline, you may have come across the following question: how do you write memory-efficient prediction data pipelines in Python? Working with a dataset that is t... read more
python machinelearning dataengineering googlecloud
Development • dev.to • May 9, 2024, 18:08
Pipeline: In this article, we will walk step by step through building a data-loading pipeline with the Apache Hop tool. Our data source will be a CSV file; we will then apply some transformations and load the data into... read more
apachehop dataengineering postgres tutorial
Development • dev.to • May 7, 2024, 16:58
What is a Data Warehouse? Think of data warehouses or data lakes as similar to Amazon warehouses, where finished goods are sent for distribution. In the case of data, warehouses are a centralized location where company data from different department da... read more
data dataengineering elt airbyte
Development • dev.to • May 6, 2024, 22:03
Chunkify a huge list into n smaller lists of equal size. To backfill data for one of our machine learning pipelines, I have to divide the date list into n smaller lists of equal length and distribute them across a cluster of n GPUs. from datetime import tim... read more
python machinelearning dataengineering
Development • dev.to • May 6, 2024, 3:19
Ever scrolled through Netflix, Disney+ or Hulu and wondered what to watch next? You can: curate your own watchlist; delve into recommendations from streaming services; or use one of countless websites and apps. Like many, I tried all of the above. The... read more
webdev typescript python dataengineering
Development • dev.to • May 3, 2024, 18:34
In my day-to-day work, one of the most common use cases for Apache Airflow is running hundreds of scheduled BigQuery SQL scripts. Developers who start with Airflow often ask the following question: how do you use Airflow to orchestrate SQL? This post ai... read more
airflow bigquery dataengineering sql
Development • dev.to • May 3, 2024, 18:26
In this article we will go through the process of ingesting data into Snowflake from Google Cloud Storage (GCS). Along the way we will cover the required concepts and tasks step by step. You can refer to the official docs here... read more
snowflake gcp dataengineering data
Development • dev.to • April 22, 2024, 9:25
CAP Theorem: The CAP theorem, also known as Brewer's theorem, states that in a distributed computer system, it is impossible to simultaneously achieve all three of the following guarantees: Consistency (C): All nodes in the system have the same da... read more
database distributedsystems dataengineering
Development • dev.to • April 17, 2024, 14:16
Graphs - 1. What is a graph? A diagram in which a line or curve shows the relationship between two or more quantities; a pictorial representation of the relationship between the quantities. Types of graphs: Geometrical graph: circle, ellipse, hyperb... read more
data datascience dataengineering analyst
Development • dev.to • April 15, 2024, 15:03
The Data Engineering Academy takes pride in presenting the transformational story of Vish, who navigated the complexities of a shifting job market to become adept in data engineering, commencing an exciting new chapter in his professional life. Vis... read more
dataengineering deacademy careerdevelopment careeradvice
Development • dev.to • April 14, 2024, 19:48
As we all know, storage plays a vital role in a virtual machine: we need to store the necessary data, the OS disk alone will not be enough, and we may need extra storage. So let us see how we can attach a data disk to a VM. We al... read more
vm dataengineering disk azure
Development • dev.to • April 11, 2024, 9:37
As we navigate through 2024, the landscape of data engineering and science continues to evolve at a breakneck pace. With advancements in AI technology come new challenges, and professionals in these fields are grappling with a unique set of challenge... read more
datascience python dataengineering data
Development • dev.to • April 4, 2024, 17:35
Discover the key techniques and strategies for effective column transformation in machine learning. Understanding Column Transformation: Column transformation is a technique used in machine learning to preprocess data before feeding it into a model... read more
machinelearning datascience dataengineering featureengineering
Development • dev.to • March 28, 2024, 7:11
We have a database table TBLTEST. Below is part of its data: The data is ordered by date. We are trying to group rows by the first five columns, convert dates in each group into an interval, and record the ending date of the last record as the infi... read more
sql database dataengineering development
Development • dev.to • March 26, 2024, 19:12
The importance of data in decision-making cannot be overstated. In today's digital age, data acts as a critical asset for organizations, enabling them to make informed decisions, identify trends, optimize operations, and drive strategic planning. The... read more
data datascience dataengineering ai
Development • dev.to • March 18, 2024, 19:42
A little data story by Markus Schüler (Director of Data Strategy, Adevinta) with drawings by Gitanjali Venkatraman (Technology Writer and Illustrator at ThoughtWorks). The profound impact of Data Mesh and its associated principles domain data owners... read more
data datamesh dataengineering
Development • dev.to • March 18, 2024, 4:01
We have a database table STAKEHOLDER as follows: We are trying to group the table by CLASS and convert all columns into a single row. Below is the desired result set: SQL code written in Oracle: WITH CTE AS( SELECT UP.CLASS,... read more
sql database coding dataengineering
Development • dev.to • March 10, 2024, 20:35
Recently, I've been designing a data lake to store different types of data from various sources, catering to diverse demands across different areas and levels. To determine the best file type for storing this data, I compiled points of interest, cons... read more
dataengineering spark benchmark datascience