Sticky post

.NET Framework and Apache Spark

Why choose .NET for Apache Spark? .NET for Apache Spark lets developers with .NET experience or existing .NET code bases take part in big data analytics. It provides high-performance APIs for using Spark from C# and F#. With C# and F#, you can access: DataFrame and Spark SQL for working with structured data; Spark Structured Streaming for working with streaming data; Spark SQL for writing queries with SQL syntax; and machine learning integration for faster training and prediction (that is, using .NET for Apache Spark alongside ML.NET). .NET for Apache Spark is compliant with .NET Standard, a formal …

Spark Cluster Overview

Apache Spark is a fast, general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python, and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools, including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming. Security in Spark is OFF by default, which could mean you are vulnerable to attack by default. Spark uses Hadoop’s client libraries for HDFS and YARN. Users can also download a “Hadoop free” binary and run Spark with any Hadoop version by augmenting Spark’s classpath. Scala and Java users can …

Running Spark on Azure Databricks

Who Should See This Blog Post? This post is intended for people who want to run their analytics and machine learning workloads on Databricks and have chosen Azure as their cloud provider. Skills required to follow: SQL, machine learning. Why do we need Databricks and Azure services? Well, we can train a model on Databricks and deploy it on Azure services. Before we can run a Spark job, we need to create a compute cluster, and before we create a compute cluster, we need to create a Databricks workspace. All the work done in Databricks need not be done within …

Machine Learning Applications for Big Data (Regression)

Machine Learning: Machine learning is an application of artificial intelligence (AI) that gives systems the ability to learn and improve automatically from experience without being explicitly programmed. Machine learning focuses on the development of computer programs that can access data and use it …
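To make "learning from experience without being explicitly programmed" concrete for the regression case named in the title, here is a minimal sketch in plain Python. It is an illustration only and not taken from the post itself: the function name `fit_line` and the sample data are hypothetical, and ordinary least squares is assumed as the fitting method.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical "experience": points generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # predict y for an unseen x
print(slope, intercept, prediction)
```

The rule y = 2x + 1 is never written into `fit_line`; the program recovers it from the data, which is the sense of "learning" used in the definition above.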

You Can Blend Apache Spark And TensorFlow To Build Potential Deep Learning Solutions

Before we start our journey, let’s explore what Spark is, what TensorFlow is, and why we want to combine them. Apache Spark™ is a unified analytics engine for large-scale data processing. Features: Speed: Run workloads up to 100x faster. Apache Spark achieves high performance for both batch and streaming data, using a state-of-the-art DAG scheduler, a query optimizer, and a physical execution engine. (DAG means directed acyclic graph.) [Figure: Logistic regression in Hadoop and Spark] Ease of Use: Write applications quickly in Java, Scala, Python, R, and SQL. Spark offers over 80 high-level operators that make it easy to build parallel apps. And you can use …
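As a rough illustration of what scheduling over a DAG (directed acyclic graph) involves, the sketch below orders the stages of a job so that each stage runs only after the stages it depends on. This is not Spark's scheduler: the stage names and the `execution_order` helper are hypothetical, and Python's standard `graphlib` module (Python 3.9+) stands in for the real machinery.

```python
from graphlib import TopologicalSorter

# Hypothetical job: each stage maps to the set of stages it depends on.
stages = {
    "read": set(),
    "filter": {"read"},
    "aggregate": {"filter"},
    "join": {"filter"},
    "write": {"aggregate", "join"},
}


def execution_order(dag):
    """Return one valid order in which every stage follows its dependencies."""
    # static_order() raises graphlib.CycleError if the graph has a cycle,
    # which is why the graph must be acyclic.
    return list(TopologicalSorter(dag).static_order())


order = execution_order(stages)
print(order)
```

In this toy graph, "aggregate" and "join" do not depend on each other, so a scheduler is free to run them in parallel once "filter" has finished.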