---
title: Telemetry
description: A description of the opt-out telemetry implementation in Axolotl.
---

# Telemetry in Axolotl
Axolotl implements anonymous telemetry to help maintainers understand how the library
is used and where users encounter issues. This data helps prioritize features, optimize
performance, and fix bugs.

## Data Collection
We collect:

- System info: OS, Python version, Axolotl version, PyTorch version, Transformers
  version, etc.
- Hardware info: CPU count, memory, GPU count and models
- Runtime metrics: Training progress, memory usage, timing information
- Usage patterns: Models (from a whitelist) and configurations used
- Error tracking: Stack traces and error messages (sanitized to remove personal
  information)

No personally identifiable information (PII) is collected.
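As a rough illustration of the system and hardware fields listed above, the following sketch gathers a few of them using only the Python standard library. The function name `collect_system_info` and the dictionary keys are hypothetical and are not taken from Axolotl's code; the real implementation also reports library versions, memory, and GPU details that are omitted here.

```python
import os
import platform


def collect_system_info() -> dict:
    """Gather coarse, non-identifying system facts (hypothetical helper)."""
    return {
        # Operating system name, e.g. "Linux" or "Darwin"
        "os": platform.system(),
        # Interpreter version as a dotted string
        "python_version": platform.python_version(),
        # Number of logical CPUs, or None if it cannot be determined
        "cpu_count": os.cpu_count(),
    }


info = collect_system_info()
```

None of these values identify a user or a machine, which is in keeping with the no-PII statement above.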

## Implementation

Telemetry is implemented using PostHog and consists of:

- `axolotl.telemetry.TelemetryManager`: A singleton class that initializes the
  telemetry system and provides methods for tracking events.
- `axolotl.telemetry.errors.send_errors`: A decorator that captures exceptions and
  sends sanitized stack traces.
- `axolotl.telemetry.runtime_metrics.RuntimeMetricsTracker`: A class that tracks
  runtime metrics during training.
- `axolotl.telemetry.callbacks.TelemetryCallback`: A Trainer callback that sends
  runtime metrics telemetry.

Unless telemetry has been explicitly enabled or disabled, the telemetry system delays
training startup by 15 seconds and shows a warning so that users are aware of data
collection.
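To make the singleton-plus-decorator arrangement concrete, here is a minimal sketch of the pattern. `SketchTelemetryManager` and `send_errors_sketch` are hypothetical stand-ins, not Axolotl's actual classes: events are buffered in memory rather than sent to PostHog, and only the exception type is recorded instead of a sanitized stack trace.

```python
import functools


class SketchTelemetryManager:
    """Hypothetical singleton that buffers events instead of sending them."""

    _instance = None

    def __new__(cls):
        # Create the shared instance on first use and reuse it afterwards
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.events = []
        return cls._instance

    def send_event(self, event_type: str, properties: dict) -> None:
        # A real manager would forward this to the telemetry backend
        self.events.append((event_type, properties))


def send_errors_sketch(func):
    """Hypothetical decorator: record the exception type, then re-raise."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            SketchTelemetryManager().send_event(
                "error", {"exception": type(exc).__name__}
            )
            raise  # never swallow the caller's error

    return wrapper
```

Re-raising inside the decorator keeps the wrapped function's behavior unchanged, so reporting an error never hides it from the user.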

## Opt-Out Mechanism

Telemetry is **enabled by default** on an opt-out basis. To disable it, set either:

- `AXOLOTL_DO_NOT_TRACK=1` (Axolotl-specific)
- `DO_NOT_TRACK=1` (Global standard; see https://consoledonottrack.com/)

To acknowledge and explicitly enable telemetry (and remove the warning message), set:
`AXOLOTL_DO_NOT_TRACK=0`.
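The rules above can be sketched as a small decision function. This is an illustration only: the function name `resolve_telemetry_state` and the returned state strings are hypothetical, and it makes two assumptions that the text does not confirm. First, only the literal values `1` and `0` are recognized. Second, an opt-out in either variable takes precedence over an explicit opt-in.

```python
import os


def resolve_telemetry_state(env=None) -> str:
    """Return 'disabled', 'enabled', or 'enabled_with_warning' (hypothetical)."""
    env = os.environ if env is None else env
    axolotl_flag = env.get("AXOLOTL_DO_NOT_TRACK")
    global_flag = env.get("DO_NOT_TRACK")

    # Assumption: an opt-out in either variable wins
    if axolotl_flag == "1" or global_flag == "1":
        return "disabled"
    # Explicit acknowledgement: telemetry on, no warning or startup delay
    if axolotl_flag == "0":
        return "enabled"
    # Nothing set: telemetry on by default, with the warning and delay
    return "enabled_with_warning"
```

For example, `resolve_telemetry_state({"DO_NOT_TRACK": "1"})` returns `"disabled"`, while an empty environment returns `"enabled_with_warning"`.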

## Privacy

- All path-like config information is automatically redacted from telemetry data
- Model information is only collected for whitelisted organizations
    - See `axolotl/telemetry/whitelist.yaml` for the set of whitelisted organizations
- Each run generates a unique anonymous ID
    - This allows us to link different telemetry events in the same training run
- Telemetry is only sent from the main process to avoid duplicate events
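The redaction and whitelist rules above might look roughly like the following sketch. Everything in it is an assumption made for illustration: the key-name heuristic in `PATH_KEY_HINTS`, the `ALLOWED_ORGS` set (a stand-in for the contents of `whitelist.yaml`), the `[REDACTED]` placeholder, and the function name `redact_config`. Axolotl's actual detection logic may differ.

```python
import uuid

REDACTED = "[REDACTED]"
# Hypothetical heuristic: keys whose names suggest a filesystem location
PATH_KEY_HINTS = ("path", "dir", "file")
# Hypothetical stand-in for the organizations listed in whitelist.yaml
ALLOWED_ORGS = {"example-org"}


def redact_config(config: dict) -> dict:
    """Return a copy of config with path-like and non-whitelisted values hidden."""
    redacted = {}
    for key, value in config.items():
        if isinstance(value, dict):
            # Recurse into nested sections
            redacted[key] = redact_config(value)
        elif any(hint in key.lower() for hint in PATH_KEY_HINTS):
            redacted[key] = REDACTED
        elif key == "base_model":
            # Keep the model name only if its organization prefix is allowed
            org = str(value).split("/", 1)[0]
            redacted[key] = value if org in ALLOWED_ORGS else REDACTED
        else:
            redacted[key] = value
    return redacted


# A random per-run ID links events from one run without identifying the user
run_id = str(uuid.uuid4())
```

The function builds a new dictionary rather than mutating its input, so the training configuration itself is never altered by telemetry.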