[CT-3525] [Bug] Option `--no-send-anonymous-usage-stats` creates different behavior compared to `DBT_SEND_ANONYMOUS_USAGE_STATS=false`

mbarugelCA commented on June 12, 2024


Comments (4)

dbeatty10 commented on June 12, 2024

Thanks for reporting this @mbarugelCA 🙏

I was able to reproduce the same thing that you reported.

Reprex

First, I needed to make sure I had all the relevant environment variables unset:

echo $DBT_SEND_ANONYMOUS_USAGE_STATS
echo $DO_NOT_TRACK
unset DBT_SEND_ANONYMOUS_USAGE_STATS
unset DO_NOT_TRACK

From there, I ran all the following commands to see what dbt is doing:

dbt --debug ls -s something --send-anonymous-usage-stats | grep -Eo "'send_anonymous_usage_stats': '(.*)'"
DBT_SEND_ANONYMOUS_USAGE_STATS=true dbt --debug ls -s something | grep -Eo "'send_anonymous_usage_stats': '(.*)'"
dbt --debug ls -s something --no-send-anonymous-usage-stats | grep -Eo "'send_anonymous_usage_stats': '(.*)'"
DBT_SEND_ANONYMOUS_USAGE_STATS=false dbt --debug ls -s something | grep -Eo "'send_anonymous_usage_stats': '(.*)'" 
DO_NOT_TRACK=true dbt --debug ls -s something | grep -Eo "'send_anonymous_usage_stats': '(.*)'"

Explanation:

--debug writes debug-level logs to standard out (rather than needing to look in logs/dbt.log).
Then the grep -Eo "'send_anonymous_usage_stats': '(.*)'" part finds the log line that shows the config settings that dbt is using.

The output we'd expect is:

'send_anonymous_usage_stats': 'True'
'send_anonymous_usage_stats': 'True'
'send_anonymous_usage_stats': 'False'
'send_anonymous_usage_stats': 'False'
'send_anonymous_usage_stats': 'False'

But depending on the config setting for send_anonymous_usage_stats within profiles.yml, I saw different behavior.

When send_anonymous_usage_stats is either set to True (default), None, or not set within profiles.yml:

'send_anonymous_usage_stats': 'True'
'send_anonymous_usage_stats': 'True'
'send_anonymous_usage_stats': 'True'   ❌ 
'send_anonymous_usage_stats': 'False'
'send_anonymous_usage_stats': 'False'

When send_anonymous_usage_stats is set to False within profiles.yml:

'send_anonymous_usage_stats': 'False'  ❌
'send_anonymous_usage_stats': 'True'
'send_anonymous_usage_stats': 'False'
'send_anonymous_usage_stats': 'False'
'send_anonymous_usage_stats': 'False'

Conclusion

👉 So it looks like the --send-anonymous-usage-stats / --no-send-anonymous-usage-stats CLI flags are being ignored rather than adhering to the standard precedence.

Acceptance criteria

The output from the above commands is always the following, regardless of whether the send_anonymous_usage_stats setting within profiles.yml is None, True, or False:

'send_anonymous_usage_stats': 'True'
'send_anonymous_usage_stats': 'True'
'send_anonymous_usage_stats': 'False'
'send_anonymous_usage_stats': 'False'
'send_anonymous_usage_stats': 'False'
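
The acceptance criteria amount to a fixed resolution order, which can be sketched in a few lines. This is not dbt-core's implementation: the function name `resolve_send_stats`, its parameters, and the `COMMANDS` list are hypothetical. It assumes the order is CLI flag, then DBT_SEND_ANONYMOUS_USAGE_STATS, then profiles.yml, then a default of True, and it assumes DO_NOT_TRACK always forces False, as the flags.py snippet quoted later in this thread does.

```python
from typing import Mapping, Optional

# Same truthy strings as the DO_NOT_TRACK check quoted from flags.py.
TRUTHY = ("1", "t", "true", "y", "yes")


def resolve_send_stats(
    cli_flag: Optional[bool],
    env: Mapping[str, str],
    profile_value: Optional[bool],
) -> bool:
    """Hypothetical resolver: CLI flag > env var > profiles.yml > default (True)."""
    # Assumption: DO_NOT_TRACK wins over everything else.
    if env.get("DO_NOT_TRACK", "").lower() in TRUTHY:
        return False
    # None means the flag was not passed on the command line.
    if cli_flag is not None:
        return cli_flag
    raw = env.get("DBT_SEND_ANONYMOUS_USAGE_STATS")
    if raw is not None:
        # Simplification: any unrecognized string is treated as False.
        return raw.lower() in TRUTHY
    if profile_value is not None:
        return profile_value
    return True


# The five reprex commands, expressed as (cli_flag, env) pairs.
COMMANDS = [
    (True, {}),
    (None, {"DBT_SEND_ANONYMOUS_USAGE_STATS": "true"}),
    (False, {}),
    (None, {"DBT_SEND_ANONYMOUS_USAGE_STATS": "false"}),
    (None, {"DO_NOT_TRACK": "true"}),
]

for profile_value in (None, True, False):
    results = [resolve_send_stats(flag, env, profile_value) for flag, env in COMMANDS]
    print(profile_value, results)
```

Under these assumptions, each of the three profiles.yml settings yields the same five values: True, True, False, False, False.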


dbeatty10 commented on June 12, 2024

Should we raise another issue that if the client is not connected to the internet we should not hang for 30 secs?

It seems reasonable to give the anonymous tracking a quick shot and give up after a very short amount of time (1 or 2 seconds total) if it isn't able to establish a connection.

If you want to open an issue for this, I'll label it as help wanted.
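
For anyone picking that up, the "quick shot, then give up" idea could look roughly like the sketch below. It is not how dbt's tracker works today: `can_reach` is a hypothetical helper, and it uses a plain TCP connection attempt from the standard library as a stand-in for whatever check the tracking code would actually need.

```python
import socket


def can_reach(host: str, port: int, timeout_seconds: float = 2.0) -> bool:
    """Return True if a TCP connection opens within timeout_seconds, else False.

    Note: the timeout bounds the connection attempt itself; DNS resolution
    is not bounded by it.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS failures.
        return False
```

A caller could skip tracking for the rest of the run when `can_reach(...)` returns False, instead of waiting on every event.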


dbeatty10 commented on June 12, 2024

I'm not sure if it affects the behavior described in this issue or not, but we take an additional environment variable DO_NOT_TRACK into consideration here:

dbt-core/core/dbt/cli/flags.py

Lines 281 to 283 in 68970d0

# Support console DO NOT TRACK initiative.
if os.getenv("DO_NOT_TRACK", "").lower() in ("1", "t", "true", "y", "yes"):
    object.__setattr__(self, "SEND_ANONYMOUS_USAGE_STATS", False)

We mostly expose only a single environment variable to control each flag / global config, but this is the rare case (and maybe the only one) in which we allow two different environment variable names:

DBT_SEND_ANONYMOUS_USAGE_STATS
DO_NOT_TRACK

According to the click docs, envvar can also be:

a list of different environment variables where the first one is picked.

Also, click handles the BOOL parameter type like this:

The string values “1”, “true”, “t”, “yes”, “y”, and “on” convert to True. “0”, “false”, “f”, “no”, “n”, and “off” convert to False.

This is almost exactly what the logic below checks (click additionally accepts "on"), which would lower the barrier to refactoring that portion of the code:

dbt-core/core/dbt/cli/flags.py

Line 282 in 68970d0

if os.getenv("DO_NOT_TRACK", "").lower() in ("1", "t", "true", "y", "yes"):

So we might be able to remove this code in favor of adding this here:

    envvar=["DO_NOT_TRACK", "DBT_SEND_ANONYMOUS_USAGE_STATS"],

At the very least, it would reduce the number of occurrences of os.getenv in the code base (which feels like tech debt). But there's a possibility it would also resolve the undesirable behavior described in this issue.
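
Before making that change, it may be worth modelling what a shared list would do. The sketch below does not call click; it re-implements the two behaviors quoted above (first variable picked, then BOOL conversion) with hypothetical helpers `to_bool`, `first_set`, and `naive_send_stats`. It assumes click would apply the same conversion to whichever variable it picks.

```python
from typing import Mapping, Optional, Sequence

# Value lists copied from the click docs quoted above and from flags.py.
CLICK_TRUE = {"1", "true", "t", "yes", "y", "on"}
CLICK_FALSE = {"0", "false", "f", "no", "n", "off"}
FLAGS_PY_TRUE = {"1", "t", "true", "y", "yes"}


def to_bool(raw: str) -> bool:
    """Convert a string using the value lists from the quoted click docs."""
    value = raw.strip().lower()
    if value in CLICK_TRUE:
        return True
    if value in CLICK_FALSE:
        return False
    raise ValueError(f"{raw!r} is not a valid boolean")


def first_set(env: Mapping[str, str], names: Sequence[str]) -> Optional[str]:
    """Return the value of the first name that is set and non-empty."""
    for name in names:
        raw = env.get(name)
        if raw:
            return raw
    return None


def naive_send_stats(env: Mapping[str, str], default: bool = True) -> bool:
    """Model of a shared envvar list feeding one boolean option."""
    raw = first_set(env, ["DO_NOT_TRACK", "DBT_SEND_ANONYMOUS_USAGE_STATS"])
    return default if raw is None else to_bool(raw)


print(sorted(CLICK_TRUE - FLAGS_PY_TRUE))                             # ['on']
print(naive_send_stats({"DO_NOT_TRACK": "true"}))                     # True
print(naive_send_stats({"DBT_SEND_ANONYMOUS_USAGE_STATS": "false"}))  # False
```

In this model, DO_NOT_TRACK=true resolves to True for an option named send_anonymous_usage_stats, because the two variables have opposite meanings. If real click behaves the same way, the DO_NOT_TRACK value would need to be inverted somewhere rather than simply added to the list.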


b-per commented on June 12, 2024

I just ran into this same issue!

I was running dbt-duckdb on a plane (no Internet), and saw that every run was hanging for 30s.

I suspected it was due to tracking so I tried adding --no-send-anonymous-usage-stats, but it was still hanging.
Now trying with no internet and DO_NOT_TRACK=1, it doesn't hang.

This seems related to this value:

dbt-core/core/dbt/tracking.py

Line 59 in afb2d61

 {"buffer_size": 30} if SNOWPLOW_TRACKER_VERSION < Version("0.13.0") else {"batch_size": 30} 

Should we raise another issue that if the client is not connected to the internet we should not hang for 30 secs?

