Detecting Fraud with Decision Tree and Spark

Apache Spark™ is a unified analytics engine for large-scale data processing. Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools including Spark SQL for SQL and structured data processing, MLlib for…

Machine Learning Applications for Big Data (Regression):

Machine Learning: Machine learning is an application of artificial intelligence (AI) that gives systems the ability to automatically learn and improve from experience without being explicitly programmed. Machine learning focuses on the development of computer programs that can access data and use it to learn for themselves. Big Data: Big Data is a phrase used to mean a massive volume…

Big Data Essentials (Unix Commands, DFS):

An internet search for the definition of big data returns a result like this: “extremely large data sets that may be analysed computationally to reveal patterns, trends, and associations, especially relating to human behaviour and interactions.” Example: “much IT investment is going towards managing and maintaining big data”. Let’s understand it first: every time you search…

Why BigQuery is The Next Big Thing With Example

BigQuery is Google’s serverless, highly scalable, enterprise data warehouse designed to make all your data analysts productive at an unmatched price-performance. Because there is no infrastructure to manage, you can focus on analyzing data to find meaningful insights using familiar SQL without the need for a database administrator. Analyze all your data by creating a…

How To Do Business Reporting in Hive Using HDP

The Apache Hive™ data warehouse software facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage. A command line tool and JDBC driver are provided to connect users to Hive. Now, as you don’t want to manage a Hadoop cluster by yourself, you…