Creating a Dataset in BigQuery
BigQuery is designed to make large-scale data analysis accessible to everyone. Whether you are a developer or a data analyst, you are probably working with data. If your startup has a small amount of data, you might be able to store it in a spreadsheet, but as your data grows to gigabytes, terabytes, or even petabytes, you will need a more efficient system, such as a data warehouse, where you can not only store data but also analyze it.
Traditionally, larger data sets have meant longer waits between asking a question and getting an answer. Have you ever had to wait hours or days for an analytics report to run? BigQuery is designed to handle massive amounts of data, such as log data from thousands of retail systems or IoT data from millions of vehicle sensors across the globe. It is a fully managed, serverless data warehouse that lets you focus on analytics instead of managing infrastructure. In BigQuery, data is stored in structured tables, which means you can use standard SQL for easy querying and data analysis.
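To illustrate that last point, here is a minimal sketch of running a standard SQL aggregation with the BigQuery Python client library. The project, dataset, and table names (my-project.analytics.events) and the country column are placeholder assumptions, not part of the setup described in this post.

from google.cloud import bigquery

# Placeholder project; assumes application default credentials are configured.
client = bigquery.Client(project="my-project")

# Hypothetical table with a `country` column, used only for illustration.
query = """
    SELECT country, COUNT(*) AS events
    FROM `my-project.analytics.events`
    GROUP BY country
    ORDER BY events DESC
    LIMIT 10
"""

# client.query() submits the job; iterating over it waits for and streams results.
for row in client.query(query):
    print(row.country, row.events)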
In the Cloud Console, go to BigQuery, enable the API if you are using it for the first time, and follow the steps below to create a dataset.

Give your dataset a unique name and select the location in which you want your tables to be stored.

You can also set an expiration for the dataset by enabling Table expiration and providing the number of days after which its tables should expire.
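If you prefer to script this step rather than use the console, the sketch below creates an equivalent dataset with the BigQuery Python client. The project ID, dataset name, location, and 30-day expiration are assumed placeholder values; adjust them to your environment.

from google.cloud import bigquery

# Placeholder project and dataset names.
client = bigquery.Client(project="my-project")

dataset = bigquery.Dataset("my-project.exported_logs")
dataset.location = "US"  # region where the dataset's tables are stored
dataset.default_table_expiration_ms = 30 * 24 * 60 * 60 * 1000  # optional: tables expire after 30 days

# exists_ok=True makes the call a no-op if the dataset already exists.
dataset = client.create_dataset(dataset, exists_ok=True)
print("Created dataset:", dataset.full_dataset_id)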

Now that the dataset is ready, let's create a sink to export logs to this dataset in BigQuery.
Create a Logs Routing Sink
To create a logs routing sink that exports logs to your BigQuery dataset, go to Logs Router from your Logging dashboard and follow the process given below to create a sink.


Start by giving your sink a name, then click Next to continue with further configuration, such as selecting the destination and applying filters to selectively export logs.

Now select a destination from Cloud Storage, Pub/Sub, or BigQuery, and also select the dataset in which you want to store the logs.

Sometimes we only need the logs of a specific service, and sometimes we want to exclude a specific service's logs from the export. For this, the Logs Router also allows you to apply filters. If you do not apply any filters, all generated logs will be exported, which may consume more space and lead to unwanted infrastructure cost.

The filter below selects only Google App Engine logs with severity INFO. To learn more about GCP logging and how this snippet is generated, refer to our blog on Cloud Logging, and then create the sink.
severity=INFO resource.type="gae_app"
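The same sink can also be created programmatically. Below is a minimal sketch using the Cloud Logging Python client; the project ID, sink name, and dataset name are assumed placeholders, and the filter is the same one shown above.

from google.cloud import logging as cloud_logging

# Placeholder project, sink, and dataset names.
client = cloud_logging.Client(project="my-project")

destination = "bigquery.googleapis.com/projects/my-project/datasets/exported_logs"
log_filter = 'severity=INFO resource.type="gae_app"'

sink = client.sink("gae-info-to-bq", filter_=log_filter, destination=destination)
if not sink.exists():
    # unique_writer_identity gives the sink its own service account for writing.
    sink.create(unique_writer_identity=True)

# This is the service account that needs write access to the dataset (next step).
print("Writer identity:", sink.writer_identity)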

Once the sink is created, the final step is to grant the IAM permissions required for writing to the selected destination. Go to your Logs Router dashboard and follow the steps given below.

Then copy the service account details (the sink's writer identity) and go to the IAM dashboard to check whether the required permissions have been granted to the service account linked with your sink; if not, grant those permissions.
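As a sketch of granting that access programmatically, the snippet below adds the sink's writer identity to the dataset's access entries with the BigQuery Python client. The dataset name is the same placeholder used earlier, and the service account email is a made-up example; use the writer identity reported by your own sink.

from google.cloud import bigquery

client = bigquery.Client(project="my-project")
dataset = client.get_dataset("my-project.exported_logs")

# Placeholder: use your sink's writer identity (drop any "serviceAccount:" prefix).
writer_identity = "p1234-567890@gcp-sa-logging.iam.gserviceaccount.com"

entries = list(dataset.access_entries)
entries.append(
    bigquery.AccessEntry(role="WRITER", entity_type="userByEmail", entity_id=writer_identity)
)
dataset.access_entries = entries

# Update only the access entries so other dataset settings are left untouched.
client.update_dataset(dataset, ["access_entries"])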

This blog is written by Amit Kumar, Head of Engineering, Checkmate Global Technologies. Connect with him to learn about startup MVP development best practices.
	