FITFLOP

bigdata (14 posts)



Apache Ranger Build Error : Failed to create assembly: Error creating assembly archive schema-registry-plugin: Problem creating jar

Resolving Apache Ranger Build Error: Failed to Create Assembly, Error Creating Assembly Archive schema-registry-plugin. When working with Apache Ranger, a common is

2 min read 18-10-2024 37

How to convert latitude and longitude columns in parquet format dataframe to point type (geometry) with Apache Sedona?

Transforming Latitude and Longitude Columns to Geometry Points with Apache Sedona. Working with spatial data in a big data context often involves converting lati

2 min read 07-10-2024 24

Load huge dataSet on UI from server as response object

Efficiently Loading Large Datasets on Your UI: A Developer's Guide. Loading massive datasets onto a user interface can be a significant challenge. Users expect resp

3 min read 05-10-2024 24

GeoMesa Accumulo custom iterator

Custom Iterators for GeoMesa Accumulo: Boosting Spatial Data Analysis Performance. GeoMesa, a powerful open-source geospatial data management system, leverages Ap

2 min read 04-10-2024 31

Processing 12 Million records through Spark

Processing 12 Million Records with Spark: A Guide to Efficient Data Manipulation. Imagine you have a dataset containing 12 million records. It could be customer da

2 min read 04-10-2024 32

Batch jobs in microservice with Apache spark as a solution

Batch Processing in Microservices: Leveraging Apache Spark for Scalability and Efficiency. Microservices architecture has become increasingly popular for its flex

3 min read 04-10-2024 62

The problem of "The application contains no execute() calls" in Flink

"The application contains no execute() calls" in Flink: A Common Pitfall and Its Solution. You're working on a Flink application and suddenly encounter the cryptic er

2 min read 03-10-2024 34

Tools implementing management and usage of indexes on WORM data storage like Apache Parquet files

Mastering Index Management for WORM Data: A Guide to Apache Parquet and Beyond. Working with Write Once Read Many (WORM) data storage like Apache Parquet files pres

3 min read 01-10-2024 62

How to modify a STRUCT type column?

How to Modify a STRUCT Type Column in Databases. In modern databases, the use of complex data types such as STRUCT (also known as RECORD or OBJECT) has become incre

3 min read 01-10-2024 31

How can I sort CSV files by columns like we see in the spreadsheets?

Sorting CSV Files Like a Spreadsheet Pro. Have you ever needed to organize your CSV data quickly, just like you would in a spreadsheet program? Sorting by specific

2 min read 30-09-2024 25

Connection reset issue while adding column on a large hbase table using Phoenix

Troubleshoot Connection Reset Errors When Adding Columns to Large HBase Tables in Phoenix. Adding a column to a large HBase table using Phoenix can sometimes l

2 min read 30-09-2024 32

Handle dataframe with ~5 billion rows

Tackling Dataframe Giants: Handling 5 Billion Rows with Grace. Working with a dataframe containing 5 billion rows presents a significant challenge. Traditional dat

2 min read 30-09-2024 27

Create a kmer database from a huge csv file

Building a K-mer Database from a Massive CSV File: A Guide. Imagine you have a gigantic CSV file filled with DNA sequences, and you need to analyze them for patter

3 min read 30-09-2024 30

How to Deploy Replicaset and Custom Images in AWS via Ray Docker Images?

Scaling Up with Ray and Docker: Deploying a Replicaset of Custom Images in AWS. Ray, the popular open-source framework for distributed Python applications, simplifi

2 min read 29-09-2024 27