Comments (33)
Our microservice architecture has contributed to small changes often taking much longer than they would otherwise.
A classic example is when we need to make an API change in a service like Exchange. You need to update Exchange + deploy, Metaphysics + deploy, Reaction + release, and Force + deploy. God forbid you have a schema mismatch or you're doing a deprecation, which then takes 2x as many PRs.
If you want to test the changes locally end-to-end first, you need to go through all the other pain that was mentioned above re: linking/setting up projects! So it tends to compound.
Harder to quantify, but maybe there's something about the amount of time it takes to feel up to date on Slack?
Regarding meetings, it's less about quantity for me and more about batching. The days when I have meetings distributed throughout the day are hard from a context-switching perspective, and I feel like I can only focus on shallow work. The days when I have my meetings clustered together, I have less context-switching and I feel like I can get deeper work done. I much prefer the days when they are clustered, so maybe we could say "prefer scheduling meetings before 1pm", or something.
This almost might just be personal preference, and I'm willing to accept that.
I'd definitely say Test Suites and CI. Specifically I think of Volt and Gravity. I can imagine my own productivity would have been much higher when working on Volt and Gravity if I didn't have to wait ~40 min to see my change get into staging.
This in general impacts how we treat these services: Volt probably gets less frequent deploys just because the PR -> production time is too long and makes any production deploy slightly riskier. You know that a hotfix is at least 40 minutes away.
I think it would be valuable to consider the intention and purpose of the various meetings people have to (or are encouraged to) attend each week. Could some of this communication and information dissemination be asynchronous (Slack, email, Notion, etc.)? Thinking of teams that hold daily standups, especially ones that consistently last longer than 10-15 minutes: is that the most efficient use of time? Could more time be spent grooming/preparing for sprint planning and sprint retros, and less time spent in the actual meetings?
Test suite duration. It's not immediately clear whether, and for how long, people sit around waiting for the suite to finish, but perhaps measuring it leads to insights after all.
My personal pet peeve is when work that's essentially done doesn't get promptly released. If it was worth doing, it should be worth releasing promptly. When things stall in staging, they also require further context switches to properly QA, migrate, and verify. All those can be handled more efficiently closer to the work.
I spend a non-trivial amount of time waiting for my local Force to load its first page.
PR creation to deploy time. This is a large over-arching one that we could probably dig into further as we start gaining insights, but an example of that could be how often people have to fix merge conflicts before a PR ends up getting merged.
I'll reiterate something that came out of the Local Discovery retro, which was that we were sometimes spending hours waiting for Emission CI builds to even start. We wasted far more engineering time than the few hundred dollars we could have paid CircleCI for iOS build concurrency. Our decision was, at the first sign of this happening again, to upgrade our plan.
I really like this framework of thinking of ways to reduce wasted engineering time, and the iOS Circle CI upgrade perfectly encapsulates how we can spend slightly more money to unlock a lot of engineering productivity. Sounds like there are some other instances (Gravity and Volt have already been mentioned).
I find that I've only upvoted the social ones (Slack, meetings) and not really any of the technical ones, just because context-switching is far more disruptive to my feeling of productivity than any incremental time spent waiting on systems (those pauses for me end up in any case being times to check Slack and email, or just take a breather).
`yarn install` time, and especially `rm -rf node_modules && yarn install`. The same applies to `bundle install` and `pod install`.
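If we wanted to quantify this, a minimal sketch could wrap the install commands and record how long they take (the wrapper and log format here are hypothetical, not an existing tool):

```python
import subprocess
import time

def timed_install(cmd):
    """Run an install command and return (exit_code, seconds_elapsed)."""
    start = time.monotonic()
    result = subprocess.run(cmd)
    return result.returncode, time.monotonic() - start

# Wrap the commands people wait on most, e.g.:
# code, secs = timed_install(["yarn", "install"])
# code, secs = timed_install(["bundle", "install"])
```

Aggregating those numbers across the team would tell us whether install time is actually worth optimising.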
Time people spend getting `yarn link` etc. to set up correctly. Perhaps this gives us clear insight into how much we need a well-functioning `yarn workspaces` setup?
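For reference, the core of a Yarn (v1) workspaces setup is just a root `package.json` like the one below; the `packages/*` path is a hypothetical layout, not our actual repo structure:

```json
{
  "private": true,
  "workspaces": ["packages/*"]
}
```

With that in place, `yarn install` at the root symlinks the local packages together automatically, which is the part `yarn link` makes us do by hand today.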
Dev environment setup and configuration, for example in linking Eigen/Emission or Reaction/Force.
Dev environment setup and configuration, for example in linking Eigen/Emission or Reaction/Force.
@ashleyjelks Perhaps you’re touching on the same thing as I did above? #185 (comment)
Dev environment setup and configuration, for example in linking Eigen/Emission or Reaction/Force.
@alloy @ashleyjelks This is something we noticed in the LD/City Guides project: we routinely hit a problem where CocoaPods would fail to `pod install` (complaining, usually, about `tipsi-stripe`). A fix for this has been PR'd and is pending review: CocoaPods/CocoaPods#8676. This is a great case for how contributing to our tools, on Artsy time, can make our engineering team overall more productive 👍
I would +1 @erikdstock's comment
I haven't seen any mention of it yet but I would say that for me personally pair programming is a net productivity gain.
Sometimes the tooling for Relay misleads you with errors. I've not yet learned when I can ignore a red squiggly line from Relay and when I can't, so I've gone on a few wild goose chases trying to make them go away... only to find out from someone else that I don't really need to fix them.
@ashfurrow pointed out this example: https://artsyproduct.atlassian.net/browse/PLATFORM-1365
And right now, I'm dealing with some errors on this PR. Things seem to work fine in the browser and when running `yarn relay`, but there are red squiggly lines in VS Code.
@ashfurrow it is Volt. I don't have much in-depth knowledge about what could be improved. @starsirius and others spend a lot of effort improving it, but the old tests in Volt from 4-5 years ago are just very expensive to run.
This morning I literally removed a single comment line from a PR that was green, and after 5 hours I still haven't been able to merge it 👇. This is more of an exception, caused by the release of a new version of Chrome, which @starsirius quickly fixed 🙏 🙏
I want to separate out the issues we have now with long CI wait time, and we can address them individually:
- We don't have sufficient containers on CircleCI to build projects in parallel during peak hours.
- Tests in some projects take significantly longer, e.g. running entire test suite takes ~15 mins for Volt, and ~25 mins for Gravity.
We can probably get some insights from CircleCI about both, i.e. time spent while a job was simply queued and time spent executing. Also want to mention that the CircleCI workflow might also contribute to the total wait time, given our limited containers.
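As a sketch of the split we'd want to see (the timestamp field names mirror what CircleCI's API reports for a job, but treat them as assumptions):

```python
from datetime import datetime

def queue_and_run_seconds(queued_at, started_at, stopped_at):
    """Split a CI job's wall time into queue wait vs execution time.

    Timestamps are ISO-8601 strings; the field names follow the shape
    of CircleCI's job data, but are assumptions here.
    """
    q = datetime.fromisoformat(queued_at)
    s = datetime.fromisoformat(started_at)
    e = datetime.fromisoformat(stopped_at)
    return (s - q).total_seconds(), (e - s).total_seconds()

queued, executed = queue_and_run_seconds(
    "2019-04-01T10:00:00", "2019-04-01T10:12:30", "2019-04-01T10:27:30"
)
# queued == 750.0 (12.5 min waiting), executed == 900.0 (15 min running)
```

If queue time dominates, more containers help; if execution time dominates, we need faster test suites.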
So here’s a list of just some of the things I think we could consider:
I like how you all are creating single comments and then we can upvote those, so going to split up my list too. I’ll start with:
Meetings. Not sure yet how to think about that one, but I’m sure we could optimise something there–and I think it’s everyone’s favourite to have less of.
Recently, after rejoining the gallery team: non-negligible time spent addressing subscription corrections that have to be done through the console. The issue there is mainly trying to understand what happened such that the adjustments are needed. Granted, those are usually non-standard cases, so they're tricky to automate.
Git commit hook duration. Probably everybody waits for that to finish instead of doing something else in the meantime.
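Measuring this is cheap; a hypothetical wrapper (the log path and lint command below are made up for illustration) could time whatever the hook actually runs:

```python
# Hypothetical timing wrapper for a Git pre-commit hook: run the real
# checks, append the elapsed seconds to a log so the wait is measurable.
import subprocess
import time

def run_hook(check_cmd, log_path="hook-times.log"):
    """Run the hook's checks, log elapsed seconds, return the exit code."""
    start = time.monotonic()
    result = subprocess.run(check_cmd)
    elapsed = time.monotonic() - start
    with open(log_path, "a") as log:
        log.write(f"pre-commit\t{elapsed:.2f}s\n")
    return result.returncode

# In .git/hooks/pre-commit you'd call something like:
# raise SystemExit(run_hook(["yarn", "lint-staged"]))
```

A week of those logs would show whether hook duration is a rounding error or a real daily cost.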
have to be done through console
@oxaudo We certainly could consider starting to record the time from opening a (production) console to closing it, and then see if we can/need to dig into that further?
Test & deploy time don't often affect the services I mainly work in so I feel less inclined to comment on those, but I can remember them being pain points in the past.
I'd echo the comments of @ashleyjelks, @pepopowitz and @alloy above about meetings. In my ideal world they would be scheduled in a continuous block and would have a clear purpose/product/outcome. Additionally, some meetings (L&L, S&T) are definitely valuable but hard to weigh against immediate demands and deadlines; in that moment they feel like extracurricular activities and contribute to my feeling of meeting bloat (the result is that I only attend those I'm definitely interested in, but I feel guilty about it).
I also feel pretty strongly that standups should be very brief `yesterday -> today -> blockers` updates and should not last longer than 1 minute per person (this seems like something every team at every company has to balance though). Follow-ups can happen immediately after.
I haven't seen any mention of it yet but I would say that for me personally pair programming is a net productivity gain.
Dev environment setup and configuration, for example in linking Eigen/Emission or Reaction/Force.
@ashleyjelks Perhaps you’re touching on the same thing as I did above? #185 (comment)
Yep, it's a duplicate issue. I'm happy to delete this one!
Just to back up my concern with time spent waiting for CI: this is a build for a one-liner PR that has been waiting to get to staging for 2 hours now.
In this specific example, follow-up work for updating Fulcrum, MP, and Reaction is blocked by this.
It's hard to predict CI's traffic pattern, and these cases don't happen too often, but they happen enough times to add up to a considerable amount of waiting time in total.
I agree with almost every single point above, but for me the most annoying thing is test run time.
I think slow tests sometimes have more implications than just wait time. For example, a lot of the time when reviewing a PR I am hesitant to ask for a minor change, because correcting a minor problem would block the entire org waiting for CircleCI time.
This part of this blog post might sound a little extreme, but I think we all agree that 'mean time to recovery' is very important and we might be too focused on increasing 'mean time between failures':
Do we ever release bugs to production? Of course. But mean time to recovery is usually more important than mean time between failures. If we deploy something that’s broken, we can often roll back within minutes. And since we ship very incremental changes, the average bug is often limited in impact. Bugs in production are often related to code that was written in the last few days, so it’s fresh in mind and can be fixed quickly.
@sepans which projects have you worked with that have the longest test times? Does this mostly cause a problem locally or on CI? (Or both?)
That's a really interesting blog post – I think it's very apropos.
I'll put together a list of these sorted by votes and review in what ways we could track the time spent on these issues.
Ok, Peril is going to keep adding back the RFC label. I'll just close this as accepted and will follow-up with resolution steps.
Turns out `[RFC]` was added to the title, which is why Peril kept adding that. Re-opening this now that I've removed it.
Closing this issue as stale