Azure Platform for Data Engineers (part-1)

Over the last 30 years, we’ve seen an exponential increase in the number of devices and software that generate data to meet current business and user needs. Businesses store, interpret, manage, transform, process, aggregate, and report this data to interested parties. These parties include internal management, investors, business partners, regulators, and consumers. Data consumers view data on PCs, tablets, and mobile devices that are either connected or disconnected. Consumers both generate and use data. They do this in the workplace and during leisure time with social media applications. Business stakeholders use data to make business decisions. Consumers use data to …


Query GitHub data using BigQuery

BigQuery is Google’s fully managed, NoOps, low-cost analytics database. With BigQuery you can query terabytes of data without needing a database administrator or any infrastructure to manage. BigQuery uses familiar SQL and a pay-only-for-what-you-use charging model, so you can focus on analyzing data to find meaningful insights. In this post we’ll see how to query the GitHub public dataset to get hands-on experience with it. Sign in to the Google Cloud Platform console (console.cloud.google.com) and navigate to BigQuery. You can also open the BigQuery web UI directly by entering the following URL in your browser. Accept the terms of service. …


Bayes Classification with Cloud Datalab, Spark, and Pig on Google Cloud

Note: If you are following along with this post, the job can take up to 1 hour 30 minutes to finish, and if you get stuck on a typo, it will only build your endurance. In this post you will learn how to deploy a …