devworxco / cloudfront-logs-java16

The AWS Cloudfront logs parser is a utility that traverses a directory of AWS Cloudfront log files and converts them into a local relational database (HSQLDB). It allowed the author to quickly start running SQL queries against his log traffic and provided an excuse to try out some of the latest Java 16 features.

Home Page: https://www.devworx.co.uk

License: Apache License 2.0



AWS Cloudfront Logs Parser

The AWS Cloudfront logs parser is a utility that traverses a directory of AWS Cloudfront log files and converts them into a local relational database (HSQLDB).

[Image: AWS Cloudfront Access Logs Query]

In essence, it is a simple, standalone utility that allowed the author to quickly start running SQL queries against his log traffic, and it provided an excuse to try out some of the latest Java 16 features.
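To illustrate the kind of conversion involved, here is a minimal sketch of turning log lines into field/value rows. It is not the utility's actual code: the class `CloudfrontLogSketch` and its `parse` method are hypothetical names. It assumes the standard Cloudfront log layout, where a `#Fields:` header names the space-separated columns and each request line holds tab-separated values. The sample field list is abbreviated for illustration, and real log files are gzip-compressed, so they would need decompressing first.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the utility's source.
class CloudfrontLogSketch {

    // Returns one map (field name -> value) per request line.
    static List<Map<String, String>> parse(List<String> lines) {
        List<Map<String, String>> rows = new ArrayList<>();
        String[] fieldNames = new String[0];
        for (String line : lines) {
            if (line.startsWith("#Fields:")) {
                // Header line: field names separated by spaces.
                fieldNames = line.substring("#Fields:".length()).trim().split("\\s+");
            } else if (!line.startsWith("#") && !line.isBlank()) {
                // Data line: tab-separated values in the same order as the header.
                String[] values = line.split("\t", -1);
                Map<String, String> row = new LinkedHashMap<>();
                for (int i = 0; i < fieldNames.length && i < values.length; i++) {
                    row.put(fieldNames[i], values[i]);
                }
                rows.add(row);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Abbreviated sample; a real log file has many more fields.
        List<String> sample = List.of(
                "#Version: 1.0",
                "#Fields: date time cs-method cs-uri-stem",
                "2021-03-01\t10:15:00\tGET\t/index.html");
        System.out.println(parse(sample));
    }
}
```

Each resulting map corresponds to one row that could be inserted into a database table.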

Getting Started

Log Files Download

You need to make the AWS Cloudfront log files available in a location that can be read by the utility. To do this, you need to install the AWS command line interface - https://aws.amazon.com/cli/ - and configure it appropriately:

aws configure

Once this is completed, you can synchronize your remote AWS S3 log directory to a local path. For instance:

aws s3 sync s3://devworx.co.uk-access-logs .

Build the Utility

To build the utility, simply execute the standard Apache Maven build command:

mvn clean install

Run the Utility

The build will produce an executable JAR file - target/cloudfront-exec.jar. You can now build an HSQLDB file with the following command:

java --enable-preview -jar target/cloudfront-exec.jar /home/jsteenkamp/aws-access-logs /home/jsteenkamp/aws-access-logs-database

Where:

  • /home/jsteenkamp/aws-access-logs is the directory to which you have downloaded your AWS Cloudfront log files from S3

  • /home/jsteenkamp/aws-access-logs-database is the directory / database into which you want to load the log entries. This same path will later be used to connect to the database through DBeaver.

Connect to the Database using DBeaver

Although you can use any tool that supports JDBC in order to connect to the HSQLDB file, in this example we will be using the excellent DBeaver product.

Once you have downloaded and started up the application, create a new connection and select HSQLDB Embedded:

[Screenshot: HSQLDB Connection]

Make sure your Path corresponds to where you created the database. The Username is sa while the Password is left blank.

[Screenshot: Setting up the Connection]

When prompted to download the drivers, please do so. You are now ready to start querying the database.
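Since any JDBC client can be used, the same connection can also be sketched in plain Java. This is a minimal example under some assumptions: that the database path given to the utility is used directly as an HSQLDB file path, that the HSQLDB driver JAR is on the classpath, and that the CLOUDFRONT_LOGS table (the name shown in the error output further down) is reachable in the default schema. The class name `CloudfrontQuerySketch` is hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical example for illustration only; not part of the utility's source.
class CloudfrontQuerySketch {

    // Builds the JDBC URL for a file-based HSQLDB database at the given path.
    static String jdbcUrl(String databasePath) {
        return "jdbc:hsqldb:file:" + databasePath;
    }

    // Counts the rows in CLOUDFRONT_LOGS, using user "sa" and a blank password.
    static long countRows(String databasePath) throws SQLException {
        try (Connection connection = DriverManager.getConnection(jdbcUrl(databasePath), "sa", "");
             Statement statement = connection.createStatement();
             ResultSet resultSet = statement.executeQuery("SELECT COUNT(*) FROM CLOUDFRONT_LOGS")) {
            resultSet.next();
            return resultSet.getLong(1);
        }
    }

    public static void main(String[] args) {
        // Defaults to the example database path used earlier in this README.
        String path = args.length > 0 ? args[0] : "/home/jsteenkamp/aws-access-logs-database";
        System.out.println("Connecting to " + jdbcUrl(path));
        try {
            System.out.println("Rows: " + countRows(path));
        } catch (SQLException e) {
            // Raised, for example, when the HSQLDB driver is missing from the classpath.
            System.out.println("Could not query the database: " + e.getMessage());
        }
    }
}
```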

Common Errors

Wrong Version of Java

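If you encounter errors such as the ones listed below, you are not building or running with Java 16+. Please ensure that you download an appropriate JDK from a site such as https://adoptopenjdk.net/ or https://www.azul.com/downloads/zulu-community/?package=jdk. Because the JAR is run with --enable-preview, a JDK newer than 16 may be rejected as well, since class files compiled with preview features are only accepted by the Java release that compiled them.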

(class file version 59.65535) was compiled with preview features that are unsupported. This version of the Java Runtime only recognizes preview features for class file version 55.65535
in project cloudfront-logs-java16: Fatal error compiling: error: invalid target release: 16 -> [Help 1]
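The class file versions in the first message can be decoded with simple arithmetic: a class file's major version is the Java feature release plus 44. The sketch below applies that rule and prints the release of the running JVM. The class and method names are hypothetical and only meant as a quick diagnostic aid.

```java
// Hypothetical diagnostic for illustration only; not part of the utility's source.
class JavaVersionSketch {

    // Class file major version = Java feature release + 44 (for example, 60 is Java 16).
    static int javaReleaseForClassFileMajor(int majorVersion) {
        return majorVersion - 44;
    }

    public static void main(String[] args) {
        // Runtime.version().feature() is available from Java 10 onwards.
        int running = Runtime.version().feature();
        System.out.println("Running on Java " + running);
        System.out.println("Class file version 59 is Java " + javaReleaseForClassFileMajor(59));
        System.out.println("Class file version 55 is Java " + javaReleaseForClassFileMajor(55));
        if (running < 16) {
            System.out.println("This runtime is older than Java 16.");
        }
    }
}
```

Read this way, the first message above describes classes compiled with Java 15 preview features being loaded by a Java 11 runtime.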

Long Entries in Log File

During development, it was occasionally noticed that certain fields in the AWS log file were excessively large. For instance, the CS_URI_QUERY field would cause this exception on the database when running the script:

Caused by: java.lang.RuntimeException: Unable to execute the SQL Batch - got the exception : java.sql.BatchUpdateException: data exception: string data, right truncation;  table: CLOUDFRONT_LOGS column: CS_URI_QUERY - current count : 21000
	at uk.co.devworx.cloudfront.CloudfrontHSQLDBCreator.lambda$process$0(CloudfrontHSQLDBCreator.java:205)
	at uk.co.devworx.cloudfront.CloudfrontReader.tryAdvance(CloudfrontReader.java:115)
	... 70 more
Caused by: java.sql.BatchUpdateException: data exception: string data, right truncation;  table: CLOUDFRONT_LOGS column: CS_URI_QUERY
	at org.hsqldb.jdbc.JDBCPreparedStatement.executeBatch(Unknown Source)

This is almost certainly a corrupted request (possibly a hack attempt). For now, the workaround is simply to increase the size of the column in the table creation script, src/main/resources/02-table-create-script.sql:

CS_URI_QUERY                            VARCHAR(10024),

You may encounter this for other fields as well. If so, please reach out and let me know (we may want to put in a more permanent fix for this).
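One possible shape for a more permanent fix, offered only as a sketch and not as the utility's current behaviour, is to truncate overlong values before they are added to the SQL batch. The class `FieldTruncationSketch` and its `truncate` method are hypothetical, and the limit of 10024 is taken from the column definition above. Note that this cuts by Java string length, which is an approximation of how the database measures the column.

```java
// Hypothetical helper for illustration only; not part of the utility's source.
class FieldTruncationSketch {

    // Matches the VARCHAR(10024) definition of CS_URI_QUERY shown above.
    static final int CS_URI_QUERY_MAX_LENGTH = 10024;

    // Returns the value unchanged if it fits, otherwise its first maxLength characters.
    static String truncate(String value, int maxLength) {
        if (value == null || value.length() <= maxLength) {
            return value;
        }
        return value.substring(0, maxLength);
    }

    public static void main(String[] args) {
        String oversized = "a".repeat(CS_URI_QUERY_MAX_LENGTH + 500);
        String stored = truncate(oversized, CS_URI_QUERY_MAX_LENGTH);
        System.out.println("Original length: " + oversized.length());
        System.out.println("Stored length: " + stored.length());
    }
}
```

The trade-off is that truncated values no longer match the original request exactly, so it may be worth logging whenever a value is cut.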
