feat: Document personal data handling and lawful basis in architecture and open questions
+91
-2
@@ -234,8 +234,97 @@ FeDIY is designed with privacy as a first-class concern. The architecture must s
### Lawful Basis for Processing

- Account data is processed on the basis of contract (the user's agreement to the terms of service).
- Where consent is required (e.g. non-essential cookies, marketing communications), it is obtained explicitly, recorded, and revocable.

### Personal Data Register

All personal data collected by FeDIY instances is documented here. Every field has a stated purpose and lawful basis. No field is collected without both.

#### Required account fields (processed under contract)

| Field | What is stored | Purpose | Retention |
|---|---|---|---|
| Email address | Plaintext (normalised) | Account recovery, notifications, operator contact | Duration of account; deleted on erasure |
| Password | Argon2id hash only — plaintext never stored or logged | Authentication | Duration of account; deleted on erasure |
| Handle (`@user@instance`) | Plaintext; URL-safe string | AP actor identity, addressability, federation | Duration of account; old handles redirect to new after a change; deleted on erasure |
| Display name | Plaintext; user-chosen, pseudonym allowed | Human-readable identity in UI and AP objects | Duration of account; deleted on erasure |
| Minimum age verified | Boolean (`true`/`false`) — raw date of birth is **not stored** | Compliance with COPPA/GDPR Art. 8 age gate; raw DOB is used once at registration to derive this flag and then discarded | Duration of account; deleted on erasure |
| Account creation timestamp | UTC timestamp | Audit, legal compliance | Duration of account; may be retained in anonymised moderation records after deletion |
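
The "Minimum age verified" row can be illustrated with a minimal sketch. Everything named here (`Date`, `AgeGate`, `derive_age_gate`, the `minimum_age` parameter) is hypothetical and not part of FeDIY; the register does not state a threshold age, so it is passed in by the caller. The point is only that the raw date of birth is moved into the function and nothing but the derived boolean comes back out.

```rust
/// Hypothetical calendar date. A real implementation would likely use a
/// date crate; this type is deliberately neither `Clone` nor `Copy`, so
/// passing it by value gives up the caller's only copy.
#[derive(Debug)]
pub struct Date {
    pub year: i32,
    pub month: u8,
    pub day: u8,
}

/// Completed years between `dob` and `today`.
fn age_in_years(dob: &Date, today: &Date) -> i32 {
    let mut age = today.year - dob.year;
    // Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (dob.month, dob.day) {
        age -= 1;
    }
    age
}

/// The only age-related value that would be persisted. There is no DOB field.
#[derive(Debug, PartialEq, Eq)]
pub struct AgeGate {
    pub minimum_age_verified: bool,
}

/// Consumes the raw date of birth and returns only the derived flag.
/// `minimum_age` is an assumption supplied by the caller.
pub fn derive_age_gate(dob: Date, today: &Date, minimum_age: i32) -> AgeGate {
    let verified = age_in_years(&dob, today) >= minimum_age;
    // `dob` is dropped when this function returns; it is never stored.
    AgeGate { minimum_age_verified: verified }
}
```

Because `Date` cannot be copied, the compiler rejects any attempt to reuse the date of birth after the check, which makes accidental persistence harder.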

#### Optional profile fields (processed under contract — user chose to provide them)

| Field | Purpose | Retention |
|---|---|---|
| Bio / about text | Public self-description | Duration of account; deleted on erasure |
| Avatar image | Visual identity in UI and AP actor object | Duration of account; deleted on erasure |
| Header / banner image | Profile page decoration | Duration of account; deleted on erasure |
| Location (free text, not geocoded) | Community context; user-declared, not verified | Duration of account; deleted on erasure |
| Preferred crafts / interests | Discovery and personalisation | Duration of account; deleted on erasure |
| Pronouns | Respectful interaction | Duration of account; deleted on erasure |
| External links (website, social profiles) | Attribution and cross-platform identity | Duration of account; deleted on erasure |
| Preferred locale | UI language and formatting | Duration of account; deleted on erasure |
| Display preferences (font, size, spacing, contrast) | Reading comfort and accessibility | Duration of account; deleted on erasure |

#### Session and authentication data (processed under contract)

| Field | What is stored | Purpose | Retention |
|---|---|---|---|
| Session token | Opaque cryptographic token (server-side record) | Authenticated API access | Purged on logout; purged on expiry; all tokens purged on account deletion |
| Token expiry | Timestamp | Session lifecycle management | Purged with token |
| Security event log | Timestamp + account ID + event type (login, logout, failed login, password change) — **no IP address** | Audit trail for account security events | Short retention (30 days); purged on account deletion |
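
As a rough sketch of the security event log row, the record below carries exactly the three listed fields and has no IP field, with the 30-day retention and the purge on account deletion as two separate operations. All type and method names are hypothetical, and the in-memory `Vec` stands in for whatever storage the real persistence layer uses.

```rust
/// The four event types listed in the register.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SecurityEventKind {
    Login,
    Logout,
    FailedLogin,
    PasswordChange,
}

/// One log entry: timestamp + account ID + event type. No IP address.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct SecurityEvent {
    /// UTC seconds since the Unix epoch.
    pub timestamp_secs: u64,
    pub account_id: u64,
    pub kind: SecurityEventKind,
}

/// 30 days, per the retention column.
pub const RETENTION_SECS: u64 = 30 * 24 * 60 * 60;

/// Hypothetical in-memory stand-in for the persisted log.
#[derive(Default)]
pub struct SecurityLog {
    events: Vec<SecurityEvent>,
}

impl SecurityLog {
    pub fn record(&mut self, event: SecurityEvent) {
        self.events.push(event);
    }

    /// Drops every event older than the retention window.
    pub fn purge_expired(&mut self, now_secs: u64) {
        self.events
            .retain(|e| now_secs.saturating_sub(e.timestamp_secs) <= RETENTION_SECS);
    }

    /// Drops every event for one account, as on account deletion.
    pub fn purge_account(&mut self, account_id: u64) {
        self.events.retain(|e| e.account_id != account_id);
    }

    pub fn len(&self) -> usize {
        self.events.len()
    }
}
```

A scheduled job calling `purge_expired` and an account-deletion handler calling `purge_account` would together cover both retention rules in the table.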

#### IP addresses

**IP addresses are never written to persistent storage.** They are present in the request context during processing and discarded when the request completes. Brute-force and abuse detection uses in-memory rate limiting scoped to the running process, not a persistent IP log.

This is a deliberate data minimisation decision. The privacy notice must state it explicitly as a feature of the platform's approach.
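
One way to picture in-memory rate limiting scoped to the running process is a fixed-window counter held in a `HashMap`. This is a sketch under assumptions: the `RateLimiter` type, its limits, and the fixed-window strategy are illustrative choices, not the documented design. The map is never serialised, so its contents disappear on process exit.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// Hypothetical fixed-window limiter. State lives only in process memory.
pub struct RateLimiter {
    max_attempts: u32,
    window: Duration,
    /// Per-address (window start, attempts in window). Never written to disk.
    buckets: HashMap<IpAddr, (Instant, u32)>,
}

impl RateLimiter {
    pub fn new(max_attempts: u32, window: Duration) -> Self {
        Self { max_attempts, window, buckets: HashMap::new() }
    }

    /// Returns `true` if the attempt is allowed, `false` if it is throttled.
    pub fn check(&mut self, ip: IpAddr, now: Instant) -> bool {
        let entry = self.buckets.entry(ip).or_insert((now, 0));
        // Start a fresh window once the previous one has elapsed.
        if now.duration_since(entry.0) >= self.window {
            *entry = (now, 0);
        }
        if entry.1 < self.max_attempts {
            entry.1 += 1;
            true
        } else {
            false
        }
    }

    /// Drops buckets whose window has elapsed so stale addresses do not
    /// linger in memory longer than needed.
    pub fn evict_expired(&mut self, now: Instant) {
        let window = self.window;
        self.buckets
            .retain(|_, (start, _)| now.duration_since(*start) < window);
    }

    /// Number of addresses currently held in memory.
    pub fn tracked(&self) -> usize {
        self.buckets.len()
    }
}
```

A consequence of this design is that limits reset on restart and are not shared between processes, which is the trade-off the section accepts in exchange for keeping no persistent IP log.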

#### Content data (processed under contract)

| Data | Notes | Retention |
|---|---|---|
| Published projects (all fields, media, steps) | Publicly visible; federated to AP peers | Tombstoned on deletion (not fully erased if referenced by others); see Right to Erasure section |
| Draft projects | Private; never federated; not visible to other users | Fully deleted immediately on account deletion or on user request |
| Media attachments | Images, files uploaded to the instance | Deleted with the content they belong to |
| Tags, materials, tools associated with projects | Part of the project record | Same lifecycle as the project |

#### Interaction data (processed under contract)

| Data | Notes | Retention |
|---|---|---|
| Follows (outgoing and incoming) | Social graph; federated as AP Follow/Accept activities | Deleted on account deletion; unfollow activity sent to peers |
| Likes / favourites | Interaction record | Deleted on account deletion |
| Bookmarks and personal collections | Private by default | Deleted on account deletion; included in data export |
| Block list | Personal moderation data | Deleted on account deletion; included in data export |
| Mute list | Personal moderation data | Deleted on account deletion; included in data export |
| Keyword filters | Personal moderation data | Deleted on account deletion; included in data export |

#### Moderation and safety records (lawful basis: legal obligation / legitimate interest)

| Data | Notes | Retention |
|---|---|---|
| Reports filed by a user | User's own report history | Included in data export; deleted on account deletion (report content retained in anonymised form) |
| Moderation actions taken against a user | Actions, outcomes, dates | Anonymised at account deletion time (personal identifiers removed, safety record retained for legal compliance) |
| CSAM / NCII violation records | Anonymised record of confirmed violations and reports filed | Retained for operator's legal compliance obligations regardless of account status |

#### Federated / remote actor data (lawful basis: legitimate interest — necessary to operate the federation)

| Data | Notes | Retention |
|---|---|---|
| Remote actor profile cache | Handle, display name, AP actor URL, public key | Retained while needed for federation processing; purged when a `Delete` activity is received for the actor |
| Received AP activities | Cached copies of federated content from remote users | Retained while operationally needed; purged on receipt of `Delete` |
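
The purge-on-`Delete` behaviour in the table above could look roughly like the sketch below. The `FederationCache` type and its methods are hypothetical, and two things are assumptions beyond the register: that a `Delete` for an actor also removes that actor's cached activities, and that signature verification of the incoming activity has already happened elsewhere.

```rust
use std::collections::HashMap;

/// Cached remote profile: the fields named in the register. The AP actor
/// URL is the map key below, so it is not repeated here.
pub struct RemoteActor {
    pub handle: String,
    pub display_name: String,
    pub public_key_pem: String,
}

/// A cached copy of federated content, tagged with the actor it came from.
pub struct CachedActivity {
    pub actor_url: String,
    pub payload: String,
}

/// Hypothetical in-memory stand-in for the federation cache.
#[derive(Default)]
pub struct FederationCache {
    /// Keyed by AP actor URL.
    actors: HashMap<String, RemoteActor>,
    /// Keyed by activity or object ID.
    activities: HashMap<String, CachedActivity>,
}

impl FederationCache {
    pub fn cache_actor(&mut self, actor_url: &str, actor: RemoteActor) {
        self.actors.insert(actor_url.to_string(), actor);
    }

    pub fn cache_activity(&mut self, id: &str, activity: CachedActivity) {
        self.activities.insert(id.to_string(), activity);
    }

    /// `Delete` whose object is an actor: purge the profile and, as an
    /// assumption of this sketch, everything cached from that actor.
    pub fn on_delete_actor(&mut self, actor_url: &str) {
        self.actors.remove(actor_url);
        self.activities.retain(|_, a| a.actor_url != actor_url);
    }

    /// `Delete` whose object is a single piece of content.
    pub fn on_delete_object(&mut self, id: &str) {
        self.activities.remove(id);
    }

    pub fn actor_count(&self) -> usize {
        self.actors.len()
    }

    pub fn activity_count(&self) -> usize {
        self.activities.len()
    }
}
```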

#### Analytics (lawful basis: legitimate interest — only if truly anonymised)

- First-party analytics are in scope, but must be **truly aggregate and anonymised** — not pseudonymised per-user event streams.
- Aggregate statistics (daily active users, most-viewed projects, search term frequency) that cannot be linked to any individual are not personal data under GDPR and require no consent.
- If per-user behavioural events are ever collected — even temporarily before aggregation — they become personal data at the point of collection and require explicit consent.
- The default configuration ships with analytics **off**. Operators enable it and are responsible for ensuring their approach stays within the anonymised boundary or obtains the required consent.

### Lawful Basis Summary

- **Contract**: required account fields, optional profile fields, session data, content data, interaction data.
- **Legal obligation**: age verification, CSAM/NCII reporting records, moderation records for legal compliance.
- **Legitimate interest**: federated actor cache, security event log, truly anonymised analytics.
- **Consent**: per-user behavioural analytics if ever collected; any future non-essential processing not covered above.
- Processing logs record the lawful basis used for each data category.
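
The summary list can be encoded so that a processing log entry cannot be written without a lawful basis. The sketch below is illustrative only: the enum names, the category granularity, and the log line format are assumptions, and the mapping simply follows the bullet list above.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum LawfulBasis {
    Contract,
    LegalObligation,
    LegitimateInterest,
    Consent,
}

/// Hypothetical data categories, one per item in the summary list.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DataCategory {
    RequiredAccountFields,
    OptionalProfileFields,
    SessionData,
    ContentData,
    InteractionData,
    AgeVerification,
    CsamNciiRecords,
    ModerationRecords,
    FederatedActorCache,
    SecurityEventLog,
    AnonymisedAnalytics,
    PerUserBehaviouralAnalytics,
}

impl DataCategory {
    /// Exhaustive match: adding a category without naming its lawful basis
    /// is a compile error.
    pub fn lawful_basis(self) -> LawfulBasis {
        use DataCategory::*;
        match self {
            RequiredAccountFields | OptionalProfileFields | SessionData
            | ContentData | InteractionData => LawfulBasis::Contract,
            AgeVerification | CsamNciiRecords | ModerationRecords => {
                LawfulBasis::LegalObligation
            }
            FederatedActorCache | SecurityEventLog | AnonymisedAnalytics => {
                LawfulBasis::LegitimateInterest
            }
            PerUserBehaviouralAnalytics => LawfulBasis::Consent,
        }
    }
}

/// Formats one processing log line; the format itself is an assumption.
pub fn processing_log_line(category: DataCategory, operation: &str) -> String {
    format!(
        "category={:?} basis={:?} op={}",
        category,
        category.lawful_basis(),
        operation
    )
}
```

Keeping the mapping in one exhaustive `match` means the register and the code have a single place where a lawful basis is assigned, which makes drift between them easier to spot in review.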
### Right to Access (Article 15)
@@ -16,6 +16,8 @@ Each question is tagged with the phase it blocks or most affects: [P0], [P1], [P
- Materials are also extensible entities. The core material record (name, quantity, unit) can carry domain-specific extension payloads using the same mechanism as project extensions. Community-defined material type schemas (e.g. yarn, filament, PCB component) can be layered on without modifying the core model.
- **Database: PostgreSQL is the primary persistence target.** JSONB, native full-text search, transactional consistency under concurrent federation fan-in, and broad managed hosting support make it the clear long-term fit. The persistence layer is behind a repository abstraction (trait-based interfaces), which keeps business logic independent of the database driver and leaves SQLite viable as a future lightweight self-hosting option without requiring changes to domain logic. See [ADR TBD: Persistence Layer Architecture].
- **Baseline content prohibitions (hardcoded, not operator-configurable):** CSAM, doxxing, and non-consensual intimate imagery (NCII) are prohibited on any FeDIY instance regardless of operator policy. The guiding principle is **consent**: minors cannot consent to sexual content; individuals have not consented to having their private identifying information published; subjects of intimate imagery have not consented to its distribution. Enforcement is in-code as far as technically feasible (hash-matching for CSAM and NCII; upload-time pattern detection and mandatory human-review tooling for doxxing). Weapons, violence, and similar dual-use content are **not** hardcoded prohibitions — legitimate DIY projects (fireworks, blacksmithing, knife-making) are indistinguishable at the software level and are handled by operator content policy and community moderation.
- **Personal data register (Q38):** Full register in ARCHITECTURE.md. Required registration fields: email, password hash, handle, display name, minimum-age-verified boolean (raw DOB discarded after age check). IP addresses never stored — ephemeral only. Optional profile fields (bio, avatar, header image, location, preferred crafts, pronouns, external links, locale, display preferences) all under contract. Analytics must be truly aggregate/anonymised — per-user event streams require consent. Handles are changeable with a redirect from old to new URL.
## Upfront Clarification Plan (P0 -> Early P1)
@@ -372,12 +374,9 @@ Decision: CSAM, doxxing, and NCII are hardcoded prohibitions enforced in code as
## Privacy and Legal Compliance
**Q38 [P0/P1]** What personal data does FeDIY collect and what is the lawful basis for each category?

- Account registration data (email, display name, password hash): processed under contract. What is the minimum required set?
- IP address and access logs: is there a documented retention window and purge policy before Phase 1 launches?
- Content data (projects, drafts, media): processed under contract. Drafts are private; what is the data model distinction between draft and published that ensures privacy?
- Session and authentication tokens: what is the TTL and purge-on-logout policy?
- Does the platform ever use personal data for purposes beyond what is needed to operate the service (e.g. analytics, recommendations)? If so, is consent obtained?

**Q38 [P0/P1]** ~~What personal data does FeDIY collect and what is the lawful basis for each category?~~ **RESOLVED — see Resolved Decisions and ARCHITECTURE.md Personal Data Register.**

Decision: Full personal data register documented in ARCHITECTURE.md. Required registration fields: email, password hash, handle, display name, minimum-age-verified boolean (raw DOB discarded after age check). IP never stored. Optional profile fields under contract. Analytics must be truly aggregate/anonymised. Handles changeable with redirect.

**Q39 [P1]** What does the right-to-access (GDPR Article 15) export look like?