Comments (3)
separation between stdout and stderr outputs
+1
remove HTTP method (GET), remove HTTP version (HTTP/1.1)
I've seen and appreciated those entries with other servers too, so I'd prefer not to remove them. In fact, it might make sense to add two more standard entries:
- HTTP status code
- size of response in bytes
remove path components (/brouter?)
That would be too much of a cleanup IMO. There might very well be different HTTP endpoints served from the same binary in the future (perhaps even today? - haven't checked), e.g. a v2, or additional helper services.
IP address of client
https://brouter.de/privacypolicy.html (as linked from https://brouter.de/brouter-web) returns 404. For GDPR compliance it should be clearly stated for what reasons and for how long that data is retained, though. See https://bikerouter.de for how it could be done properly.
Before that is fixed, storing the IP should be disabled immediately. IOW, logging the IP should be off by default, or it should be replaced by a static placeholder. Logging the real IP should be an option (config or env), to be enabled only by admins who know what they are doing (see paragraph above).
Is logging the IP even needed, or would a hash be enough?
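The "off by default, opt-in for admins" behaviour could look roughly like the sketch below. It is written in Python for brevity and is not BRouter code; the environment variable name `BROUTER_LOG_CLIENT_IP` and the placeholder `-` are hypothetical, chosen only for illustration.

```python
import os

# Static placeholder written instead of the real address (hypothetical choice).
PLACEHOLDER = "-"

def ip_for_log(ip: str, env=os.environ) -> str:
    """Return the value to write into the IP field of a log line.

    The real IP is only returned when the (hypothetical) variable
    BROUTER_LOG_CLIENT_IP is explicitly set to a truthy value.
    """
    enabled = env.get("BROUTER_LOG_CLIENT_IP", "").lower() in ("1", "true", "yes")
    return ip if enabled else PLACEHOLDER

# Default: nothing configured, so the placeholder is logged.
print(ip_for_log("203.0.113.7", env={}))
# Opt-in by an admin who has sorted out the retention policy.
print(ip_for_log("203.0.113.7", env={"BROUTER_LOG_CLIENT_IP": "true"}))
```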
Hello marcus+msb48,
I restored https://brouter.de/privacypolicy.html ...
It is difficult for me to comment on the RFC if you do not say what you plan to do with the logs. As I said, I am only doing some access statistics and quality assurance for the quality of service of the brouter.de instance. Here, the summer peak is about 200 parallel sessions (so the cap at 999 is theoretical so far) and about 400,000 requests per day.
Yesterday, however, there were 900,000 requests, and looking at the log shows what happened:
18.05.24 11:47 127 ip=147.45.47.133 ms=1 -> GET /brouter/suspects/Austria/Vienna/confirmed/843514405793336318/fixed?ndays=30-1))%20OR%20938=(SELECT%20938%20FROM%20PG_SLEEP(15))-- HTTP/1.1
Lessons learned here: 1) there is more than just standard routing requests in that log, 2) intrusion detection needs unformatted information, 3) the IP address is needed to add it to the blacklist.
However, for context you need to know that there are the nginx access/error logs in addition, which contain the IP and the URL as well. On brouter.de, they are set up for daily rotation/gzip and a 2-week archive.
The brouter logging has been rotated manually up to now (which does not work well).
So it depends: if you want to have a long-term archive of access statistics (and I would like to have one), that should be some hybrid that works in conjunction with the nginx log, keeping the brouter log free of IPs for GDPR compliance.
I find it hard to believe that you will be happy with structured JSON here when talking about 100 million requests per year. Another aspect is that if the JSON comes with a library dependency, then it comes at a price (up to now we create (Geo)JSON at a low level).
So Marcus, maybe you can comment on the intended usage of the log?
Regards, Arndt
Hi @abrensch,
My motivation for introducing structured logging is to analyze the logs more efficiently. Structured logs can be imported directly into modern log management systems like Elasticsearch and then searched, analyzed, aggregated, or visualized with frontends like Kibana, Graylog, Grafana, etc.
Examples:
- “Which are the most used routing profiles in the winter months vs. the summer months?”
- “How often are the route alternatives used with a certain routing profile?”
- “Show me the distribution of processing times vs. routing profiles.”
All of this is possible with plain text logs too, but it requires more effort to parse the logs and extract the relevant information.
I'm using Elasticsearch, Graylog and Kibana for other projects. Depending on the time of day, hundreds to thousands of messages are ingested every second on some small virtual servers. So performance is definitely not an issue. 100M messages would require only a few gigabytes of storage, which is negligible nowadays.
As the log format is relatively simple, building the messages manually would not be a hard task if no further external dependencies are allowed.
Of course it'd be easily possible to pick up the existing log messages and convert them to structured logs by an external tool or script. Integration into BRouter is not necessarily important to me, but I thought it might also be useful for other server operators.
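A minimal sketch of such an external conversion script, in Python for brevity and using only the standard library. It assumes the line layout of the hashed-client sample lines in this thread: timestamp, session marker, client token, a number taken here to be the processing time in milliseconds, then the request line. The field names `session`, `client` and `ms` are illustrative labels, not BRouter's own.

```python
import json
import re
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Assumed layout: <timestamp> <session> <client> <ms> <METHOD> <target> <HTTP/x.y>
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+) (?P<session>\S+) (?P<client>\S+) (?P<ms>\d+) "
    r"(?P<method>[A-Z]+) (?P<target>\S+) (?P<protocol>HTTP/\S+)$"
)

def to_json(line: str) -> Optional[str]:
    """Convert one plain-text log line to a JSON object, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None  # leave unknown lines to the caller instead of guessing
    record = match.groupdict()
    record["ms"] = int(record["ms"])
    url = urlsplit(record.pop("target"))
    record["path"] = url.path
    # parse_qs returns a list per key; keep the first value of each parameter.
    record["query"] = {key: values[0] for key, values in parse_qs(url.query).items()}
    return json.dumps(record)

sample = (
    "2024-05-19T10:46:18.273+02:00 new a9f14018d5 167 GET "
    "/?lonlats=13.377485,52.516247%7C13.351221,52.515004"
    "&profile=trekking&alternativeidx=0&format=geojson HTTP/1.1"
)
print(to_json(sample))
```

Each output line is one JSON object, which is the shape most log shippers accept for ingestion; questions like "most used routing profile" then become an aggregation on `query.profile`.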
Re IP addresses/GDPR:
Logging, identifying, and blocking rogue clients are better done at least one level above BRouter, for example, in the reverse proxy, before the request reaches the BRouter server.
This way, the IP address can easily be printed in a hashed form to BRouter's log messages (even the session pool could work with the hashes instead of plain IP addresses).
In my development setup, I've already implemented logging hashes: SHA256(IP address + User-Agent), then I extract the first 10 characters. This way, each client remains unique but is not traceable to an IP address:
2024-05-19T10:46:18.273+02:00 new a9f14018d5 167 GET /?lonlats=13.377485,52.516247%7C13.351221,52.515004&profile=trekking&alternativeidx=0&format=geojson HTTP/1.1
2024-05-19T10:46:27.256+02:00 1 a9f14018d5 90 GET /?lonlats=13.377485,52.516247%7C13.351221,52.515004&profile=trekking&alternativeidx=0&format=geojson HTTP/1.1
2024-05-19T10:46:27.762+02:00 1 a9f14018d5 97 GET /?lonlats=13.377485,52.516247%7C13.351221,52.515004&profile=trekking&alternativeidx=0&format=geojson HTTP/1.1
2024-05-19T10:46:28.157+02:00 1 a9f14018d5 53 GET /?lonlats=13.377485,52.516247%7C13.351221,52.515004&profile=trekking&alternativeidx=0&format=geojson HTTP/1.1
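The hashing step described above can be sketched as follows, in Python for brevity. The plain string concatenation, the UTF-8 encoding and the lowercase hex output are assumptions; the thread only states SHA256(IP address + User-Agent) truncated to 10 characters. The IP and User-Agent values are made-up examples.

```python
import hashlib

def client_token(ip: str, user_agent: str, length: int = 10) -> str:
    """Return the first `length` hex characters of SHA-256(ip + user_agent)."""
    digest = hashlib.sha256((ip + user_agent).encode("utf-8")).hexdigest()
    return digest[:length]

# Example values only (203.0.113.0/24 is a documentation address range).
print(client_token("203.0.113.7", "Mozilla/5.0 (example)"))
```

The same client always maps to the same token, so sessions can still be grouped. Note that an unsalted hash over a known User-Agent and the small IPv4 space can in principle be brute-forced; mixing in a secret salt would be one way to harden it, but that goes beyond what is described here.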
@msb48, thank you for your remarks regarding the logged HTTP information. It makes total sense. I'll update the table in the first post accordingly.