FedDIY/docs/OPEN_QUESTIONS.md

# FeDIY Open Questions

A living document of unresolved design and product questions. When a question is resolved, record the decision as an ADR and remove or archive the entry here.

Each question is tagged with the phase it blocks or most affects: [P0], [P1], [P2], [P3], [P4].

## Resolved Decisions

- A "project" is a defined work process.
- A project includes a list of materials/ingredients, a list of required tools, and step-by-step ordered instructions.
- Steps may include embedded media hosted on-instance or linked from external sources.
- Projects may include one or more external canonical links (for example homepage, repository, or source publication).
- Explicit project versioning is preferred and should be supported.
- A project is composable: FeDIY provides a minimal core model, and instances can tailor project detail via optional domain-specific extensions.
- FeDIY should not require first-party implementation of every domain detail up front; expert communities can define richer schemas over time.

## Upfront Clarification Plan (P0 -> Early P1)

The goal is to remove ambiguity before implementation while keeping scope realistic.

1. Decision track A: Project revision model

   - Resolve Q1 first.
   - Output: one ADR defining draft/publish/supersede behavior and history visibility.
   - Success signal: API and UI contracts can assume a stable project lifecycle.

2. Decision track B: Core-plus-extension contract

   - Resolve Q2, Q2a, Q2b, and Q2c as one coherent contract.
   - Output: one ADR for core project fields and one ADR for extension payload shape/namespacing/discovery.
   - Success signal: instances can add domain-specific detail (knitting, 3D print, electronics, etc.) without changing core semantics.

3. Decision track C: Search/index policy for extensions

   - Resolve the extension-indexing part of Q2c with Q23.
   - Output: indexing policy doc or ADR that separates required indexed fields from opaque extension fields.
   - Success signal: predictable search behavior across instances with different extensions.

4. Decision track D: API stability and evolution

   - Resolve Q10 and Q11 before broad client development.
   - Output: API versioning/deprecation policy with compatibility guarantees for third-party clients.
   - Success signal: extension evolution does not break existing clients unexpectedly.

---

## Domain Model

**Q1 [P1]** What should the explicit versioning model look like?

- Immutable numbered revisions, mutable drafts, or both?
- Does each version snapshot materials, tools, steps, media references, and extension data?
- What is the publish workflow for a new version (draft -> published -> superseded)?
- How should old versions be exposed in API and UI (full history vs selected milestones)?
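
To make the "both" option concrete, here is a minimal sketch of immutable numbered revisions combined with at most one mutable draft. It is not a decision: `ProjectHistory`, `Revision`, and the method names are hypothetical, and the snapshot contents (materials, tools, steps, media, extensions) are omitted.

```rust
/// Lifecycle states named in the question above (draft -> published -> superseded).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RevisionState {
    Draft,
    Published,
    Superseded,
}

/// Hypothetical revision record; a real one would also carry the snapshot data.
#[derive(Debug, Clone)]
pub struct Revision {
    pub number: u32,
    pub state: RevisionState,
}

/// Hypothetical history: numbered revisions plus at most one open draft.
#[derive(Debug, Default)]
pub struct ProjectHistory {
    revisions: Vec<Revision>,
}

impl ProjectHistory {
    /// Open a new draft unless one already exists; returns its revision number.
    pub fn start_draft(&mut self) -> Result<u32, &'static str> {
        if self.revisions.iter().any(|r| r.state == RevisionState::Draft) {
            return Err("a draft already exists");
        }
        let number = self.revisions.len() as u32 + 1;
        self.revisions.push(Revision { number, state: RevisionState::Draft });
        Ok(number)
    }

    /// Publish the open draft and mark the previously published revision as superseded.
    pub fn publish_draft(&mut self) -> Result<u32, &'static str> {
        let draft_idx = self
            .revisions
            .iter()
            .position(|r| r.state == RevisionState::Draft)
            .ok_or("no draft to publish")?;
        for r in self.revisions.iter_mut() {
            if r.state == RevisionState::Published {
                r.state = RevisionState::Superseded;
            }
        }
        self.revisions[draft_idx].state = RevisionState::Published;
        Ok(self.revisions[draft_idx].number)
    }

    /// The revision that API and UI would show by default.
    pub fn current_published(&self) -> Option<&Revision> {
        self.revisions.iter().find(|r| r.state == RevisionState::Published)
    }

    /// States of all revisions in numeric order (full-history view).
    pub fn states(&self) -> Vec<RevisionState> {
        self.revisions.iter().map(|r| r.state).collect()
    }
}
```

Under this sketch, exposing "full history" means returning every entry from `states()`, while "selected milestones" would need an extra marker on `Revision`.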

**Q2 [P1]** How are materials modelled?

- Free-text only, or linked to a shared taxonomy/catalog?
- Do quantities and units need to be structured (for search/filtering), or is prose sufficient for MVP?

**Q2a [P1]** How are required tools modelled?

- Free-text tools list only, or a normalized tool taxonomy?
- Should tools support optional metadata (skill level, safety notes, alternatives)?

**Q2b [P1]** How are canonical external links modelled and validated?

- Single canonical link or multiple typed links (homepage, repository, video, reference)?
- Are external links version-scoped or project-scoped?
- What URL validation and safety checks are required?

**Q2c [P1/P2]** What is the extension model for domain-specific project data?

- What extension shape is supported first: typed JSON blocks, namespaced key/value fields, or plugin-defined schemas?
- How are extensions identified and namespaced to avoid collisions across instances?
- How do API and UI clients discover which extensions are present on a project?
- Which extension fields are indexed for search, and which remain opaque?
- What validation guarantees does the server provide for extension payloads?
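
As one illustration of the "namespaced" option, the sketch below keeps core fields fixed and stores extension payloads as opaque text under a `namespace:name` identifier. The identifier syntax, the `ExtensionId` and `Project` types, and the example namespace `knitting.example` are all assumptions made for this example, not a proposed contract.

```rust
use std::collections::BTreeMap;

/// Hypothetical extension identifier of the form `<namespace>:<name>`.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
pub struct ExtensionId {
    pub namespace: String,
    pub name: String,
}

impl ExtensionId {
    /// Parse `namespace:name`, accepting only lowercase ASCII letters, digits, `.`, `-`, `_`.
    pub fn parse(raw: &str) -> Option<Self> {
        let (namespace, name) = raw.split_once(':')?;
        let ok = |s: &str| {
            !s.is_empty()
                && s.chars().all(|c| {
                    c.is_ascii_lowercase() || c.is_ascii_digit() || matches!(c, '.' | '-' | '_')
                })
        };
        if ok(namespace) && ok(name) {
            Some(Self { namespace: namespace.to_string(), name: name.to_string() })
        } else {
            None
        }
    }
}

/// Core fields stay fixed; extension payloads are opaque (raw JSON text here).
#[derive(Debug, Default)]
pub struct Project {
    pub title: String,
    pub extensions: BTreeMap<ExtensionId, String>,
}

impl Project {
    /// Attach a payload; a second payload under the same identifier is rejected.
    pub fn add_extension(&mut self, id: ExtensionId, payload: String) -> Result<(), String> {
        if self.extensions.contains_key(&id) {
            return Err(format!("extension {}:{} already present", id.namespace, id.name));
        }
        self.extensions.insert(id, payload);
        Ok(())
    }

    /// Discovery: list which extensions are present without interpreting them.
    pub fn extension_ids(&self) -> Vec<String> {
        self.extensions
            .keys()
            .map(|id| format!("{}:{}", id.namespace, id.name))
            .collect()
    }
}
```

A client that does not understand `knitting.example:gauge` can still list it via `extension_ids()` and ignore the payload, which is the property Decision track B asks for.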

**Q3 [P1]** What is the tag and category strategy?

- Folksonomy (free-form user tags), curated taxonomy, or both?
- Is there a category hierarchy (craft type → technique → material)?
- Who controls the taxonomy on a given instance?

**Q4 [P1]** What licences can a project carry?

- Does the platform enforce a licence selection, or is it free-text?
- Does the platform's own CC BY-SA 4.0 licence apply to user content by default, or is that a separate question?

**Q5 [P3/P4]** What are the rules for project forks and remixes?

- Can a user fork another user's project (including from a remote instance)?
- Does a fork maintain a reference/attribution link to the original?
- What licence constraints apply to remixing?

---

## Identity and Authentication

**Q6 [P1]** What is the authentication mechanism?

- Local username/password (with secure hashing), passkeys, OAuth2/OIDC third-party providers, magic-link email, or a combination?
- Is email verification required for account creation?

**Q7 [P1]** What account fields are required beyond the ActivityPub actor minimum?

- Display name, bio, avatar, location — which are required, optional, or omitted entirely?
- Are there craft-specific profile fields (preferred crafts, skill level)?

**Q8 [P2]** Can remote ActivityPub actors interact without a local account?

- Can a remote actor follow a local user, like a project, or comment, without registering locally?
- What local data is persisted for a remote actor who interacts?

**Q9 [P1]** What is the session and token strategy for the API?

- Short-lived JWTs, opaque bearer tokens with a refresh flow, or server-side sessions?
- How are tokens invalidated (logout, account suspension)?

---

## API Design

**Q10 [P1]** What API style?

- REST with JSON, or something else (GraphQL, JSON:API)?
- Is there value in JSON:API's sparse fieldsets and relationship includes for a project-browsing use case?

**Q11 [P1]** What is the API versioning strategy?

- URL path prefix (`/api/v1/`), `Accept` header versioning, or no versioning until a breaking change forces it?
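
For the URL-path option, routing would need to separate the version from the rest of the path. A small sketch, assuming a purely numeric major version and a hypothetical helper name `split_api_version`:

```rust
/// Split a request path such as `/api/v1/projects` into (major version, remaining path).
/// Returns `None` when the path has no `/api/v<N>` prefix with a numeric `<N>`.
pub fn split_api_version(path: &str) -> Option<(u32, &str)> {
    let rest = path.strip_prefix("/api/v")?;
    let (version, tail) = match rest.find('/') {
        Some(idx) => (&rest[..idx], &rest[idx..]),
        None => (rest, "/"),
    };
    // Reject empty or non-digit versions explicitly (`parse` alone would accept "+1").
    if version.is_empty() || !version.chars().all(|c| c.is_ascii_digit()) {
        return None;
    }
    let version = version.parse().ok()?;
    Some((version, tail))
}
```

With `Accept` header versioning the same split would happen on a media-type parameter instead, leaving request paths unchanged.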

**Q12 [P1]** What is the pagination strategy?

- Cursor-based (stable under concurrent inserts), offset-based, or keyset pagination?
- What is the default and maximum page size?
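
To show why keyset-style cursors stay stable under concurrent inserts, here is an in-memory sketch that pages by the sort key `(created_at, id)` rather than by offset. The type `ProjectSummary`, the function `page_after`, and the page-size constants are hypothetical placeholders, not proposed values.

```rust
/// Hypothetical listing row; only the sort key matters for this sketch.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ProjectSummary {
    pub created_at: u64, // Unix seconds
    pub id: u64,
    pub title: String,
}

/// Cursor = sort key of the last item the client has seen.
pub type Cursor = (u64, u64);

pub const DEFAULT_PAGE_SIZE: usize = 20; // placeholder value
pub const MAX_PAGE_SIZE: usize = 100; // placeholder value

/// Return the items strictly after `cursor` in (created_at, id) order,
/// plus the cursor to request the following page (if this page was full).
pub fn page_after(
    items: &[ProjectSummary],
    cursor: Option<Cursor>,
    limit: Option<usize>,
) -> (Vec<ProjectSummary>, Option<Cursor>) {
    let limit = limit.unwrap_or(DEFAULT_PAGE_SIZE).clamp(1, MAX_PAGE_SIZE);
    let mut sorted: Vec<&ProjectSummary> = items.iter().collect();
    sorted.sort_by_key(|p| (p.created_at, p.id));
    let page: Vec<ProjectSummary> = sorted
        .into_iter()
        .filter(|p| cursor.map_or(true, |c| (p.created_at, p.id) > c))
        .take(limit)
        .cloned()
        .collect();
    // A full page may still be the last one; the next request then returns an empty page.
    let next = if page.len() == limit {
        page.last().map(|p| (p.created_at, p.id))
    } else {
        None
    };
    (page, next)
}
```

Because the cursor is a key rather than a position, a row inserted before the cursor between two requests does not shift or duplicate later results, which is the failure mode of offset-based paging.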

**Q13 [P1]** What rate-limiting and abuse-prevention strategy applies to the API?

- Per-IP, per-authenticated-user, or both?
- Does this apply equally to federation endpoints?
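
One way to support "both" is to key the same limiter by either IP or authenticated user. The sketch below uses a token bucket, which is a common technique but only one candidate; `ClientKey`, `RateLimiter`, and the capacity and refill numbers are hypothetical. Time is passed in explicitly to keep the example deterministic.

```rust
use std::collections::HashMap;

/// Who is being limited: an anonymous IP or an authenticated user.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum ClientKey {
    Ip(String),
    User(u64),
}

#[derive(Debug)]
struct Bucket {
    tokens: f64,
    last_seen_secs: f64,
}

/// Token-bucket limiter: each key holds up to `capacity` tokens,
/// refilled continuously at `refill_per_sec`.
#[derive(Debug)]
pub struct RateLimiter {
    capacity: f64,
    refill_per_sec: f64,
    buckets: HashMap<ClientKey, Bucket>,
}

impl RateLimiter {
    pub fn new(capacity: u32, refill_per_sec: f64) -> Self {
        Self {
            capacity: f64::from(capacity),
            refill_per_sec,
            buckets: HashMap::new(),
        }
    }

    /// Returns true and consumes one token if the request is allowed at `now_secs`.
    /// A real server would read a monotonic clock instead of taking a parameter.
    pub fn allow(&mut self, key: ClientKey, now_secs: f64) -> bool {
        let (capacity, refill) = (self.capacity, self.refill_per_sec);
        let bucket = self.buckets.entry(key).or_insert(Bucket {
            tokens: capacity,
            last_seen_secs: now_secs,
        });
        let elapsed = (now_secs - bucket.last_seen_secs).max(0.0);
        bucket.tokens = (bucket.tokens + elapsed * refill).min(capacity);
        bucket.last_seen_secs = now_secs;
        if bucket.tokens >= 1.0 {
            bucket.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

Federation endpoints could reuse the same structure with a third key variant per remote instance, if the answer to the second bullet is that they need separate limits.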

---

## Front-End / Bundled UI

**Q14 [P1]** What technology powers the bundled web UI?

- Vanilla JS / progressive enhancement (HTMX or similar), a Rust WASM front-end framework, or a JS framework (e.g. Svelte, Vue)?
- Does the front-end code live in the same Cargo workspace or in a separate directory with its own build tooling?

**Q15 [P1]** What is the no-JavaScript fallback scope?

- Read-only browsing (project pages, search results) without JS is desirable.
- Authoring without JS is probably out of scope — is that an explicit decision?

**Q16 [P1]** What is the accessibility baseline and implementation strategy?

- WCAG 2.1 AA is the minimum standard. How do we validate compliance and maintain it over time?
- What alt text strategy is required for all media (project images, step-by-step photos, diagrams, charts)? Do we enforce it at upload time or provide tools for post-hoc addition?
- What captions and transcripts strategy for audio/video content? Who is responsible for creation?
- What text-to-speech capabilities should be built into the API and UI? Should the API provide structured step data (title, description, images) in a format suitable for TTS clients?
- How do we support blind and low-vision users in the bundled UI? Screen reader testing and semantic HTML are baseline; what else?
- How do we support Deaf and hard-of-hearing users? Captions for instructional videos, visual indicators for audio cues, readable transcripts.
- How do we support users with motor differences? Full keyboard navigation, no time-dependent interactions, adjustable interface sizes/spacing.
- What color contrast requirements apply to project images and diagrams submitted by users? Can we provide tools or guidance to improve accessibility?
- Are there craft-community-specific accessibility concerns (e.g., tactile or kinesthetic learning for certain crafts, accessible alternatives for physical demonstrations)?
- What is the accessibility review process for features and content? How are accessibility issues prioritized?

**Q16a [P1]** What reading-comfort customization options does the bundled UI provide?

- What font choices are offered? At minimum, the UI should offer a dyslexia-friendly typeface (e.g., OpenDyslexic, Atkinson Hyperlegible) alongside a standard option.
- Should the full base font be user-overridable (including via browser/OS settings and user stylesheets)? The bundled UI must not block this.
- What line-height, letter-spacing, word-spacing, and paragraph-width adjustments are available? (Wide spacing and shorter line length aid many readers.)
- Are text size controls per-user and persistent (stored in account preferences), or session-only?
- Should there be a low-visual-stress color theme (off-white/cream backgrounds, reduced contrast)? Some users with dyslexia or visual stress find high-contrast black-on-white harder to read.
- How do user-defined display preferences interact with instance theming? User preferences must take precedence.
- Are reading-comfort settings exposed via API so third-party clients can retrieve and honor them?

**Q16b [P1]** What is the localization (i18n/l10n) strategy?

- What locale data format is used for UI strings? (gettext PO, JSON, TOML, ICU MessageFormat?)
- What is the language tagging convention for user-generated content? BCP 47 language tags are assumed — is this confirmed?
- Does the platform support RTL scripts (Arabic, Hebrew, Persian, etc.) from the first UI pass? CSS logical properties are required if so.
- What locale-sensitive formatting is required at MVP? (dates, times, numbers, units of measure)
- How are translations contributed by the community? Are translation files in the main repo or a separate project?
- What is the fallback behavior when a user's preferred locale is not available for a piece of content?
- Does the search index need locale-specific text analysis (stemming, tokenization) configured per language?
- Does the ActivityPub object for a project carry `@language` or equivalent metadata to signal content language to federated instances?
- Is machine translation (MT) in scope? If so, is it instance-opt-in, per-user opt-in, or always-on?
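
For the locale-fallback bullet, one candidate behavior is to drop trailing subtags from the preferred BCP 47 tag until something matches, then fall back to an instance default. The sketch below is a simplified form of the "lookup" scheme in RFC 4647 (it does not special-case single-letter subtags); `resolve_locale` is a hypothetical name.

```rust
/// Pick the best available locale for `preferred` by progressively truncating
/// the tag at its last `-`, falling back to `default` when nothing matches.
/// Comparison is ASCII case-insensitive; the returned tag keeps the
/// spelling used in `available`.
pub fn resolve_locale<'a>(preferred: &str, available: &[&'a str], default: &'a str) -> &'a str {
    let mut candidate = preferred.to_ascii_lowercase();
    loop {
        if let Some(found) = available
            .iter()
            .copied()
            .find(|a| a.eq_ignore_ascii_case(&candidate))
        {
            return found;
        }
        match candidate.rfind('-') {
            // Drop the last subtag, e.g. "pt-br" -> "pt".
            Some(idx) => candidate.truncate(idx),
            None => return default,
        }
    }
}
```

For example, a reader preferring `pt-BR` would receive `pt` content when only `en`, `pt`, and `de` exist, and the default when no prefix of the tag is available.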

---

## Federation and ActivityPub

**Q17 [P2]** How do FeDIY project objects map to ActivityPub types?

- Use `Note` or `Article` for broad compatibility, or define a custom `FeDIYProject` type?
- If custom, what fallback representation do we provide for clients that don't understand it?

**Q18 [P2]** What ActivityPub activity types does FeDIY support in phase 2?

- Minimum set: `Create`, `Update`, `Delete`, `Follow`, `Accept`, `Reject`, `Undo`.
- Phase 2 or later: `Like`, `Announce` (boost), `Flag` (report)?

**Q19 [P2]** What is the media federation strategy?

- Do media attachments (images, files) federate as links to the canonical origin, or are they replicated locally?
- How are broken/unavailable remote media handled in the UI?

**Q20 [P2]** What is the HTTP Signatures key lifecycle?

- Per-actor keypairs (standard), or instance-level signing with `keyId` delegation?
- Key rotation: when and how are keys rotated, and how are remote instances notified?

**Q21 [P2]** Which well-known endpoints are in scope for phase 2?

- WebFinger (required for actor discovery).
- NodeInfo (instance metadata for compatibility and listing services).
- `/.well-known/openid-configuration` if OIDC is supported, or `/.well-known/oauth-authorization-server` (RFC 8414) for plain OAuth 2.0 server metadata.

**Q22 [P2]** How is WebFinger structured for FeDIY entities?

- Actors are users: `acct:user@instance` — standard.
- Do projects also have addressable AP identities, or do they belong to the author actor?
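
Whatever is decided for projects, the user case needs the `acct:` resource split into a local name and a host. A minimal sketch with a hypothetical helper `parse_acct` (it does not validate allowed characters or normalize case):

```rust
/// Parse a WebFinger `resource` value of the form `acct:user@instance`
/// into `(user, instance)`. Returns `None` for any other shape.
pub fn parse_acct(resource: &str) -> Option<(&str, &str)> {
    let rest = resource.strip_prefix("acct:")?;
    let (user, host) = rest.rsplit_once('@')?;
    if user.is_empty() || host.is_empty() || user.contains('@') {
        return None;
    }
    Some((user, host))
}
```

If projects were given their own addressable identities, they would need either a distinct `acct:` naming scheme or resolution by URL instead, which this sketch does not cover.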

---

## Search and Discovery

**Q23 [P1]** What is the full-text search implementation?

- PostgreSQL full-text search (zero extra infra), an embedded engine (Tantivy via Rust), or an external service (Meilisearch, Elasticsearch)?
- What fields are indexed: title, description, steps, tags, materials?

**Q24 [P2]** Does search span federated content?

- Phase 1: local content only.
- Phase 2+: do we query remote instances, or build a local index of federated objects we've received?

---

## Media Storage

**Q25 [P1]** How are user-uploaded media assets stored?

- Local filesystem, S3-compatible object storage, or both with a configurable backend?
- What is the maximum file size and permitted formats for MVP?

**Q26 [P1]** Is image processing (resize, thumbnail, format conversion) in-process or delegated?

- In-process with a Rust image library, or delegated to an external service/worker?

---

## Persistence

**Q27 [P0/P1]** What is the database?

- PostgreSQL is the obvious choice given full-text search, JSONB for flexible AP object storage, and broad hosting support — is this confirmed?
- Is SQLite a supported option for lightweight self-hosting, or is that complexity not worth it?

**Q28 [P1]** What is the database migration strategy?

- A Rust migration library (sqlx migrate, refinery), or a standalone tool (Flyway, Liquibase)?

---

## Moderation

**Q29 [P1]** How are moderator roles assigned and scoped?

- Instance admin assigns moderators manually; no self-service promotion.
- Are there multiple moderation tiers (e.g. content moderator vs instance admin)?

**Q29a [P1]** What user-level personal moderation tools are provided, and how are they represented in the data model?

- A user must be able to block specific local and remote accounts without any moderator involvement. What is the API shape for a user-level block?
- A user must be able to block an entire remote instance. Does this map to an ActivityPub `Block` activity, a local filter record, or both?
- Are user-level mutes (suppress content without blocking) distinct from blocks in the data model?
- What keyword and wildcard filter capabilities are supported? Are filters applied server-side (content never delivered to client) or client-side (UI suppresses matching content)? Both options should be supportable.
- When a user blocks another, is the blocked party notified? (Convention in AP-based platforms is not to notify.)
- Do user-level blocks prevent the blocked party from seeing the user's public content, or only prevent interaction?
- How are user-level moderation actions represented in the user's own data export (GDPR portability)?
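
To ground the keyword and wildcard bullet, the sketch below shows one possible matching rule that could run either server-side or in a client: `*` matches any run of characters, matching is case-insensitive, and a filter applies when any whitespace-separated word of the text matches. These semantics and the names `wildcard_match` and `is_filtered` are assumptions; punctuation handling is deliberately left out.

```rust
/// Case-insensitive match where `*` in `pattern` matches any run of
/// characters (including none). The whole of `text` must be covered.
pub fn wildcard_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.to_lowercase().chars().collect();
    let t: Vec<char> = text.to_lowercase().chars().collect();
    let (mut pi, mut ti) = (0usize, 0usize);
    // (pattern index just after the last `*`, text index that `*` currently ends at)
    let mut star: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi + 1, ti));
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = sp;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    // Only trailing `*`s may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}

/// True when any filter matches any whitespace-separated word of `text`.
pub fn is_filtered(filters: &[&str], text: &str) -> bool {
    text.split_whitespace()
        .any(|word| filters.iter().any(|f| wildcard_match(f, word)))
}
```

Under these rules the filter `crypto*` hides a post containing "cryptocurrency", while the plain keyword `giveaway` does not hide one that only contains "giveaways".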

**Q29b [P1/P2]** What is the data model and AP representation for shareable block and recommendation lists?

- Shareable block lists: what format is used for export and import? JSON-LD? A well-known community format (e.g., Oliphant CSV, FediBlock)?
- Can a block list be published as a live federated resource that subscribers can follow and receive updates from?
- Recommendation lists: are these a first-class AP object type (e.g., `OrderedCollection` of `Project` objects), or a local-only feature?
- Personal collections/bookmarks: are these private by default with an explicit publish action, or public by default?
- What privacy model applies to list subscriptions? Can a user see who has subscribed to their block list?
- How does subscribing to another user's block list interact with the subscriber's own moderation decisions? (Additive by default; subscriber retains override.)
- Are community-maintained lists (collaboratively edited by a group) in scope, and if so, what is the authorship/edit governance model?
- Can lists be versioned or snapshotted so a subscriber can audit what changed between updates?

**Q30 [P3]** Do we participate in shared blocklists (e.g. FIRES, Oliphant tiers)?

- Subscribe to external block lists automatically, or manual import only?
- Is this a phase 3 concern or deferred entirely?
- Does the user-level list subscription mechanism (Q29b) subsume this, or is it a separate instance-level concern?

**Q31 [P3]** What is the user-facing report and appeals workflow?

- Can a user see the status of their own report?
- Is there a formal appeals process, or are appeals handled at moderator discretion?

---

## Deployment and Operations

**Q32 [P0/P1]** What is the primary deployment target?

- Single static binary + external PostgreSQL (simplest self-hosting).
- OCI/Docker container image.
- NixOS module.
- All three, or a prioritised subset?

**Q33 [P1]** What are the minimum self-hosting requirements?

- RAM, CPU, disk, and network minimums for a small instance.
- Is there a single-binary mode with embedded SQLite for hobbyist hosting (see Q27)?

**Q34 [P1]** What is the configuration strategy?

- Environment variables only, a config file (TOML), or both?
- What must be configurable per-instance (instance name, federation policy, storage backend, SMTP, etc.)?

**Q35 [P4]** What observability stack is expected?

- Structured logging to stdout (12-factor), Prometheus metrics endpoint, OpenTelemetry traces?
- Are these required at launch or added progressively?

---

## Content and Community Policy

**Q36 [P0]** Does the platform define baseline content guidelines beyond what moderation tooling enforces?

- What categories of content are prohibited regardless of instance policy?
- CSAM must be prohibited and reported — is there a mechanism planned?

**Q37 [P4]** Are collaborative/co-authorship workflows in scope?

- Can multiple accounts be listed as co-authors of a project?
- Is there a contribution workflow (pull-request style) or trust-based co-author invite?

---

## Privacy and Legal Compliance

**Q38 [P0/P1]** What personal data does FeDIY collect and what is the lawful basis for each category?

- Account registration data (email, display name, password hash): processed under contract. What is the minimum required set?
- IP address and access logs: is there a documented retention window and purge policy before Phase 1 launches?
- Content data (projects, drafts, media): processed under contract. Drafts are private; what is the data model distinction between draft and published that ensures privacy?
- Session and authentication tokens: what is the TTL and purge-on-logout policy?
- Does the platform ever use personal data for purposes beyond what is needed to operate the service (e.g. analytics, recommendations)? If so, is consent obtained?

**Q39 [P1]** What does the right-to-access (GDPR Article 15) export look like?

- What data categories are included in a user's full export: account profile, published projects, draft projects, interactions (follows, likes, bookmarks, block lists), moderation history affecting the user, session and audit records?
- What format is the export? JSON is required; is a human-readable HTML or PDF summary also provided?
- Is the export self-service (user-initiated via UI and API) or does it require admin action?
- What is the SLA for delivering an export? GDPR requires response within one month.
- How are exports of federated data handled — i.e. data about the user that exists on remote instances? The export must explain this limitation.

**Q40 [P1]** What does account deletion (GDPR Article 17 — right to erasure / right to be forgotten) entail?

- What is the deletion workflow? Must be self-service, not admin-gated.
- What data is fully erased: credentials, profile fields, private drafts, session tokens, email address, IP logs?
- What is tombstoned rather than erased: publicly published content that is referenced by others? The privacy notice must explain the tombstone policy.
- What moderation records are retained in anonymised form for legal/safety obligations, and for how long?
- How long after a deletion request is the data fully purged? A maximum window (e.g. 30 days) must be defined and disclosed.
- How is deletion propagated to federated instances (GDPR Art. 17(2))? An ActivityPub `Delete` activity is sent to known peers; remote compliance cannot be guaranteed. The privacy notice must say this.
- Is there a delivery-receipt log for `Delete` activities sent to federated peers, retained for operator accountability (separate from user personal data)?
- Is there a deletion-in-progress state visible to the user while federation propagation is completing?
- Does the architecture support deletion of individual published items (a single project, a comment) without requiring full account closure?
- Is legal hold (suspension of deletion during active investigation or legal proceedings) required? If so, how is the user notified?
- For financial/transaction records, what is the operator-configurable retention window to satisfy accounting law?

**Q40a [P1]** How does the platform support compliance with US state privacy laws alongside GDPR?

- CCPA/CPRA (California): deletion SLA is 45 days (extendable to 90). Does the system support configurable SLA windows per instance?
- CCPA/CPRA: right to opt-out of sale/sharing of personal data. FeDIY does not sell data, but does the architecture support the Global Privacy Control (GPC) signal as an opt-out mechanism?
- CCPA/CPRA: right to correct inaccurate personal information. Is this satisfied by standard profile editing?
- CCPA/CPRA: right to know (categories of data collected, sources, purposes, third parties). Is this satisfied by the privacy notice and data export?
- Virginia VCDPA, Colorado CPA, Connecticut CTDPA, Texas TDPSA and others: broadly equivalent deletion and portability rights with 45-day SLAs. Does the jurisdiction-agnostic deletion workflow satisfy all of these?
- Colorado CPA specifically: Global Privacy Control signals must be honored. Is GPC processing in scope for Phase 1?
- Brazil LGPD Article 18: right to deletion of unnecessary or unlawfully processed data, equivalent in scope to GDPR Art. 17. Is the architecture jurisdiction-agnostic enough to satisfy this without custom logic?
- Canada PIPEDA (and the CPPA proposed in Bill C-27): limited right to deletion currently; the architecture should anticipate the stronger rights proposed there if such legislation is enacted. What future-proofing is needed?
- UK GDPR: equivalent to EU GDPR; same architecture satisfies it, but does the template privacy notice need UK-specific customisation (ICO contact details, UK law references)?
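
The "configurable SLA windows per instance" idea raised in the first bullet could be as small as a base window plus an optional extended window. The sketch below is illustrative only: `DeletionSla` and its fields are hypothetical, and the 45/90-day figures used in the usage note are the CCPA/CPRA numbers quoted above, not legal advice.

```rust
/// Hypothetical per-instance deletion SLA: a base window and an optional
/// extended window, both counted in days from the request.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DeletionSla {
    pub base_days: u32,
    pub extended_days: Option<u32>,
}

pub const SECONDS_PER_DAY: u64 = 86_400;

impl DeletionSla {
    /// Deadline (Unix seconds) for a request made at `requested_at`.
    /// `extended` only has an effect when an extended window is configured.
    pub fn deadline(&self, requested_at: u64, extended: bool) -> u64 {
        let days = match (extended, self.extended_days) {
            (true, Some(max)) => max,
            _ => self.base_days,
        };
        requested_at + u64::from(days) * SECONDS_PER_DAY
    }

    /// True once `now` is past the applicable deadline.
    pub fn is_overdue(&self, requested_at: u64, now: u64, extended: bool) -> bool {
        now > self.deadline(requested_at, extended)
    }
}
```

An operator subject to CCPA/CPRA might configure `DeletionSla { base_days: 45, extended_days: Some(90) }`, while an instance that commits to a fixed 30-day purge would leave `extended_days` as `None`.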

**Q40b [P1]** What is the operator guidance for jurisdiction-specific compliance customisation?

- Which compliance settings are configurable per-instance (SLA window, data categories, GPC handling, legal hold policy)?
- Does FeDIY provide a compliance checklist for operators launching a public instance?
- Does the template privacy notice include jurisdiction-specific variants (EU, UK, California, Brazil) or a single configurable template?
- How does FeDIY communicate clearly that the instance operator is the data controller and bears primary legal responsibility?

**Q41 [P1]** What are the data retention periods for each category?

- Authentication and failed-login logs: short retention (30–90 days suggested); is this configurable by instance operators?
- IP address logs: are these stored at all beyond ephemeral request processing? If so, what is the maximum retention window?
- Deleted account data: what is the maximum time between deletion request and full purge of identifiable data?
- Moderation records: anonymised retention for legal purposes — what anonymisation process is applied and for how long are they kept?
- Inactive account data: is there a policy for purging accounts that have never been activated or have been dormant for an extended period?

**Q42 [P1]** What is the data portability (Article 20) export format, and can it be imported?

- Is the export format capable of round-tripping into another FeDIY instance (i.e. migrate your account and projects to a different instance)?
- Does the export include ActivityPub actor identity in a way that helps federated peers update their records after a migration?
- Is account migration (move actor from instance A to instance B, with follower redirect) in scope? If so, which phase?

**Q43 [P0/P1]** What is the privacy notice strategy for instance operators?

- FeDIY provides a template privacy notice that operators must customise. What is the minimum required content?
- Is the privacy notice served at a well-known URL (`/privacy`) before the instance accepts registrations?
- How does the platform ensure that operators have published a privacy notice? (e.g. configuration check at startup, NodeInfo metadata)
- Who is the data controller for a self-hosted instance? The instance operator. Is this clearly communicated in the code and documentation?

**Q44 [P1]** What is the children's privacy policy?

- What minimum age is required to register? (GDPR Article 8 requires parental consent below age 16 by default, and member states may lower that threshold to as low as 13; COPPA requires verifiable parental consent for children under 13 in the US.)
- Is age verification self-declaration only, or is a stronger mechanism required?
- What happens if a minor's account is reported? Is there a defined response process?

**Q45 [P2]** How are federated data subjects' rights handled?

- If a user on a remote instance requests erasure of data held locally (e.g. cached profile, received activities), what is the process?
- Does receiving a `Delete` activity for a remote actor trigger a purge of all locally cached data about that actor?
- Are there GDPR obligations that apply to data received via federation from instances in different jurisdictions?