Matsubaa 941a9da928 feat: Update ROADMAP with personal data handling and user moderation features
2026-05-23 17:26:28 -05:00

# FeDIY Open Questions
A living document of unresolved design and product questions. When a question is resolved, record the decision as an ADR and remove or archive the entry here.
Each question is tagged with the phase it blocks or most affects: [P0], [P1], [P2], [P3], [P4].
## Resolved Decisions
- A "project" is a defined work process.
- A project includes a list of materials/ingredients, a list of required tools, and step-by-step ordered instructions.
- Steps may include embedded media hosted on-instance or linked from external sources.
- Projects may include one or more external canonical links (for example homepage, repository, or source publication).
- Explicit project versioning is preferred and should be supported.
- A project is composable: FeDIY provides a minimal core model, and instances can tailor project detail via optional domain-specific extensions.
- FeDIY should not require first-party implementation of every domain detail up front; expert communities can define richer schemas over time.
## Upfront Clarification Plan (P0 -> Early P1)
The goal is to remove ambiguity before implementation while keeping scope realistic.
1. Decision track A: Project revision model
   - Resolve Q1 first.
   - Output: one ADR defining draft/publish/supersede behavior and history visibility.
   - Success signal: API and UI contracts can assume a stable project lifecycle.
2. Decision track B: Core-plus-extension contract
   - Resolve Q2, Q2a, Q2b, and Q2c as one coherent contract.
   - Output: one ADR for core project fields and one ADR for extension payload shape/namespacing/discovery.
   - Success signal: instances can add domain-specific detail (knitting, 3D print, electronics, etc.) without changing core semantics.
3. Decision track C: Search/index policy for extensions
   - Resolve the extension-indexing part of Q2c with Q23.
   - Output: indexing policy doc or ADR that separates required indexed fields from opaque extension fields.
   - Success signal: predictable search behavior across instances with different extensions.
4. Decision track D: API stability and evolution
   - Resolve Q10 and Q11 before broad client development.
   - Output: API versioning/deprecation policy with compatibility guarantees for third-party clients.
   - Success signal: extension evolution does not break existing clients unexpectedly.
---
## Domain Model
**Q1 [P1]** What should the explicit versioning model look like?
- Immutable numbered revisions, mutable drafts, or both?
- Does each version snapshot materials, tools, steps, media references, and extension data?
- What is the publish workflow for a new version (draft -> published -> superseded)?
- How should old versions be exposed in API and UI (full history vs selected milestones)?
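
To make the draft -> published -> superseded option concrete, the workflow can be read as a small state machine over numbered revisions. The sketch below illustrates that one option only and does not settle Q1; `RevisionState`, `Revision`, and `publish` are hypothetical names, and it assumes history is retained, not deleted.

```rust
/// Lifecycle states named in Q1 (hypothetical type, not a FeDIY API).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RevisionState {
    Draft,
    Published,
    Superseded,
}

/// One numbered revision of a project (fields are illustrative only).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Revision {
    pub number: u32,
    pub state: RevisionState,
}

/// Publish the draft with the given number. Any revision that was
/// published before is marked superseded; nothing is removed from history.
/// Returns an error if the target is missing or is not a draft.
pub fn publish(history: &mut [Revision], number: u32) -> Result<(), String> {
    let idx = history
        .iter()
        .position(|r| r.number == number)
        .ok_or_else(|| format!("revision {} not found", number))?;
    if history[idx].state != RevisionState::Draft {
        return Err(format!("revision {} is not a draft", number));
    }
    for r in history.iter_mut() {
        if r.state == RevisionState::Published {
            r.state = RevisionState::Superseded;
        }
    }
    history[idx].state = RevisionState::Published;
    Ok(())
}
```

Under this reading at most one revision is `Published` at a time, which is the property an API contract would most likely depend on.
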
**Q2 [P1]** How are materials modelled?
- Free-text only, or linked to a shared taxonomy/catalog?
- Do quantities and units need to be structured (for search/filtering), or is prose sufficient for MVP?
**Q2a [P1]** How are required tools modelled?
- Free-text tools list only, or a normalized tool taxonomy?
- Should tools support optional metadata (skill level, safety notes, alternatives)?
**Q2b [P1]** How are canonical external links modelled and validated?
- Single canonical link or multiple typed links (homepage, repository, video, reference)?
- Are external links version-scoped or project-scoped?
- What URL validation and safety checks are required?
**Q2c [P1/P2]** What is the extension model for domain-specific project data?
- What extension shape is supported first: typed JSON blocks, namespaced key/value fields, or plugin-defined schemas?
- How are extensions identified and namespaced to avoid collisions across instances?
- How do API and UI clients discover which extensions are present on a project?
- Which extension fields are indexed for search, and which remain opaque?
- What validation guarantees does the server provide for extension payloads?
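
As a point of reference for the namespacing and discovery questions, here is a minimal sketch of the "namespaced opaque payload" option. It assumes extension namespaces are absolute `https://` URLs (one possible collision-avoidance rule, not a decision) and keeps payloads as raw strings so that no JSON library is implied. `Project`, `add_extension`, and `extension_namespaces` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Minimal core fields plus opaque, namespaced extension payloads.
#[derive(Debug, Default)]
pub struct Project {
    pub title: String,
    /// Key: extension namespace, e.g. "https://example.org/ns/knitting".
    /// Value: the extension payload, uninterpreted by the core.
    pub extensions: BTreeMap<String, String>,
}

impl Project {
    /// Attach an extension. Requiring an absolute https URL as namespace
    /// (an assumed rule) keeps two instances from colliding on a short name.
    pub fn add_extension(&mut self, namespace: &str, payload: &str) -> Result<(), String> {
        if !namespace.starts_with("https://") || namespace.len() <= "https://".len() {
            return Err(format!("invalid extension namespace: {}", namespace));
        }
        if self.extensions.contains_key(namespace) {
            return Err(format!("extension already present: {}", namespace));
        }
        self.extensions
            .insert(namespace.to_string(), payload.to_string());
        Ok(())
    }

    /// Discovery: the namespaces a client can expect on this project.
    pub fn extension_namespaces(&self) -> Vec<&str> {
        self.extensions.keys().map(|k| k.as_str()).collect()
    }
}
```

The only validation the core performs here is on the namespace. Whether the server should also validate payloads against a schema is exactly what the last bullet above leaves open.
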
**Q3 [P1]** What is the tag and category strategy?
- Folksonomy (free-form user tags), curated taxonomy, or both?
- Is there a category hierarchy (craft type → technique → material)?
- Who controls the taxonomy on a given instance?
**Q4 [P1]** What licences can a project carry?
- Does the platform enforce a licence selection, or is it free-text?
- Does the platform's own CC BY-SA 4.0 licence apply to user content by default, or is that a separate question?
**Q5 [P3/P4]** What are the rules for project forks and remixes?
- Can a user fork another user's project (including from a remote instance)?
- Does a fork maintain a reference/attribution link to the original?
- What licence constraints apply to remixing?
---
## Identity and Authentication
**Q6 [P1]** What is the authentication mechanism?
- Local username/password (with secure hashing), passkeys, OAuth2/OIDC third-party providers, magic-link email, or a combination?
- Is email verification required for account creation?
**Q7 [P1]** What account fields are required beyond the ActivityPub actor minimum?
- Display name, bio, avatar, location — which are required, optional, or omitted entirely?
- Are there craft-specific profile fields (preferred crafts, skill level)?
**Q8 [P2]** Can remote ActivityPub actors interact without a local account?
- Can a remote actor follow a local user, like a project, or comment, without registering locally?
- What local data is persisted for a remote actor who interacts?
**Q9 [P1]** What is the session and token strategy for the API?
- Short-lived JWTs, opaque bearer tokens with a refresh flow, or server-side sessions?
- How are tokens invalidated (logout, account suspension)?
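
For comparison with JWTs, the opaque-token option makes invalidation a simple table operation. The sketch below assumes a server-side token table; `TokenStore` and its methods are hypothetical, and real token strings would come from a cryptographically secure random source (here the caller supplies them so the example has no dependencies).

```rust
use std::collections::HashMap;

/// Server-side table for opaque bearer tokens (one of the Q9 options).
#[derive(Default)]
pub struct TokenStore {
    // token -> (account id, expiry as seconds since an arbitrary epoch)
    tokens: HashMap<String, (u64, u64)>,
}

impl TokenStore {
    pub fn issue(&mut self, token: &str, account: u64, expires_at: u64) {
        self.tokens.insert(token.to_string(), (account, expires_at));
    }

    /// Resolve a token to an account if it exists and has not expired.
    pub fn validate(&self, token: &str, now: u64) -> Option<u64> {
        match self.tokens.get(token) {
            Some(&(account, expires_at)) if now < expires_at => Some(account),
            _ => None,
        }
    }

    /// Logout: revoke one token. Returns whether it existed.
    pub fn revoke(&mut self, token: &str) -> bool {
        self.tokens.remove(token).is_some()
    }

    /// Account suspension: revoke every token of the account.
    /// Returns the number of tokens removed.
    pub fn revoke_account(&mut self, account: u64) -> usize {
        let before = self.tokens.len();
        self.tokens.retain(|_, v| v.0 != account);
        before - self.tokens.len()
    }
}
```

With stateless JWTs the same two operations would need a denylist or very short lifetimes, which is the trade-off this question has to weigh.
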
---
## API Design
**Q10 [P1]** What API style?
- REST with JSON, or something else (GraphQL, JSON:API)?
- Is there value in JSON:API's sparse fieldsets and relationship includes for a project-browsing use case?
**Q11 [P1]** What is the API versioning strategy?
- URL path prefix (`/api/v1/`), `Accept` header versioning, or no versioning until a breaking change forces it?
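
If the URL path prefix option is chosen, routing needs to separate the version from the rest of the path. A minimal sketch, assuming the `/api/v<N>/` shape quoted above; `split_api_version` is a hypothetical helper, not an existing FeDIY function.

```rust
/// Extract the major version from a path such as "/api/v1/projects".
/// Returns the version and the remaining path, or None if the path does
/// not carry a well-formed version prefix.
pub fn split_api_version(path: &str) -> Option<(u32, &str)> {
    let rest = path.strip_prefix("/api/v")?;
    let end = rest.find('/').unwrap_or(rest.len());
    let digits = &rest[..end];
    // Accept plain decimal digits only (no sign, no empty version).
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let version: u32 = digits.parse().ok()?;
    Some((version, &rest[end..]))
}
```
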
**Q12 [P1]** What is the pagination strategy?
- Cursor-based (stable under concurrent inserts), offset-based, or keyset pagination?
- What is the default and maximum page size?
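
The stability claim for cursor or keyset pagination can be seen in a small in-memory sketch: the cursor is the last key returned, not a row offset, so rows inserted earlier in the order do not shift later pages. `Page`, `page_after`, and the `MAX_PAGE_SIZE` value of 100 are assumptions for illustration, not proposed defaults.

```rust
/// A page of results plus the cursor to request the next page.
#[derive(Debug, PartialEq, Eq)]
pub struct Page {
    pub ids: Vec<u64>,
    pub next_cursor: Option<u64>,
}

/// Assumed cap on page size; the real default and maximum are open (Q12).
pub const MAX_PAGE_SIZE: usize = 100;

/// Keyset pagination over ids sorted ascending: return up to `limit`
/// ids strictly greater than the `after` cursor.
pub fn page_after(sorted_ids: &[u64], after: Option<u64>, limit: usize) -> Page {
    let limit = limit.clamp(1, MAX_PAGE_SIZE);
    // First index whose id is greater than the cursor.
    let start = match after {
        Some(cursor) => sorted_ids.partition_point(|&id| id <= cursor),
        None => 0,
    };
    let end = (start + limit).min(sorted_ids.len());
    let ids = sorted_ids[start..end].to_vec();
    // Only hand out a cursor when more rows remain.
    let next_cursor = if end < sorted_ids.len() {
        ids.last().copied()
    } else {
        None
    };
    Page { ids, next_cursor }
}
```

In a SQL backend the same idea would be a `WHERE id > $cursor ORDER BY id LIMIT $n` query; the slice here only stands in for the table.
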
**Q13 [P1]** What rate-limiting and abuse-prevention strategy applies to the API?
- Per-IP, per-authenticated-user, or both?
- Does this apply equally to federation endpoints?
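
One common way to implement either keying choice is a token bucket per key, where the key is an IP address, a user id, or a combination. The sketch below names token bucket as the technique; it is not a chosen FeDIY design. `RateLimiter`, the capacity, and the refill rate are hypothetical, and time is passed in explicitly so the behavior is deterministic.

```rust
use std::collections::HashMap;

/// Token bucket per key (an IP address, a user id, or both joined).
pub struct RateLimiter {
    capacity: f64,
    refill_per_sec: f64,
    // key -> (tokens available, time of last update in seconds)
    buckets: HashMap<String, (f64, f64)>,
}

impl RateLimiter {
    pub fn new(capacity: f64, refill_per_sec: f64) -> Self {
        RateLimiter {
            capacity,
            refill_per_sec,
            buckets: HashMap::new(),
        }
    }

    /// Returns true if the request identified by `key` is allowed at `now`.
    pub fn allow(&mut self, key: &str, now: f64) -> bool {
        let capacity = self.capacity;
        let refill = self.refill_per_sec;
        // A key seen for the first time starts with a full bucket.
        let entry = self
            .buckets
            .entry(key.to_string())
            .or_insert((capacity, now));
        // Refill for the elapsed time, never above capacity.
        let elapsed = (now - entry.1).max(0.0);
        entry.0 = (entry.0 + elapsed * refill).min(capacity);
        entry.1 = now;
        if entry.0 >= 1.0 {
            entry.0 -= 1.0;
            true
        } else {
            false
        }
    }
}
```

Federation endpoints could reuse the same structure keyed by remote instance domain, with different limits, if the answer to the second bullet is "no".
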
---
## Front-End / Bundled UI
**Q14 [P1]** What technology powers the bundled web UI?
- Vanilla JS / progressive enhancement (HTMX or similar), a Rust WASM front-end framework, or a JS framework (e.g. Svelte, Vue)?
- Does the choice live in the same Cargo workspace or a separate directory with its own build tooling?
**Q15 [P1]** What is the no-JavaScript fallback scope?
- Read-only browsing (project pages, search results) without JS is desirable.
- Authoring without JS is probably out of scope — is that an explicit decision?
**Q16 [P1]** What is the accessibility baseline and implementation strategy?
- WCAG 2.1 AA is the minimum standard. How do we validate compliance and maintain it over time?
- What alt text strategy is required for all media (project images, step-by-step photos, diagrams, charts)? Do we enforce it at upload time or provide tools for post-hoc addition?
- What captions and transcripts strategy for audio/video content? Who is responsible for creation?
- What text-to-speech capabilities should be built into the API and UI? Should the API provide structured step data (title, description, images) in a format suitable for TTS clients?
- How do we support blind and low-vision users in the bundled UI? Screen reader testing and semantic HTML are baseline; what else?
- How do we support Deaf and hard-of-hearing users? Captions for instructional videos, visual indicators for audio cues, readable transcripts.
- How do we support users with motor differences? Full keyboard navigation, no time-dependent interactions, adjustable interface sizes/spacing.
- What color contrast requirements apply to project images and diagrams submitted by users? Can we provide tools or guidance to improve accessibility?
- Are there craft-community-specific accessibility concerns (e.g., tactile or kinesthetic learning for certain crafts, accessible alternatives for physical demonstrations)?
- What is the accessibility review process for features and content? How are accessibility issues prioritized?
**Q16a [P1]** What reading-comfort customization options does the bundled UI provide?
- What font choices are offered? At minimum, the UI should offer a dyslexia-friendly typeface (e.g., OpenDyslexic, Atkinson Hyperlegible) alongside a standard option.
- Should the full base font be user-overridable (including via browser/OS settings and user stylesheets)? The bundled UI must not block this.
- What line-height, letter-spacing, word-spacing, and paragraph-width adjustments are available? (Wide spacing and shorter line length aid many readers.)
- Are text size controls per-user and persistent (stored in account preferences), or session-only?
- Should there be a low-visual-stress color theme (off-white/cream backgrounds, reduced contrast)? Some users with dyslexia or visual stress find high-contrast black-on-white harder to read.
- How do user-defined display preferences interact with instance theming? User preferences must take precedence.
- Are reading-comfort settings exposed via API so third-party clients can retrieve and honor them?
**Q16b [P1]** What is the localization (i18n/l10n) strategy?
- What locale data format is used for UI strings? (gettext PO, JSON, TOML, ICU MessageFormat?)
- What is the language tagging convention for user-generated content? BCP 47 language tags are assumed — is this confirmed?
- Does the platform support RTL scripts (Arabic, Hebrew, Persian, etc.) from the first UI pass? CSS logical properties are required if so.
- What locale-sensitive formatting is required at MVP? (dates, times, numbers, units of measure)
- How are translations contributed by the community? Are translation files in the main repo or a separate project?
- What is the fallback behavior when a user's preferred locale is not available for a piece of content?
- Does the search index need locale-specific text analysis (stemming, tokenization) configured per language?
- Does the ActivityPub object for a project carry `@language` or equivalent metadata to signal content language to federated instances?
- Is machine translation (MT) in scope? If so, is it instance-opt-in, per-user opt-in, or always-on?
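
For the fallback question, one simple behavior is to drop trailing subtags of the user's BCP 47 tag and then fall back to the instance default. This sketch uses plain truncation only and does not implement the full RFC 4647 lookup rules. `fallback_chain` and `pick_locale` are hypothetical names.

```rust
/// Build a fallback chain for a BCP 47 tag by dropping trailing subtags,
/// then appending the instance default if it is not already present.
pub fn fallback_chain(preferred: &str, instance_default: &str) -> Vec<String> {
    let mut chain = Vec::new();
    let mut tag = preferred.to_string();
    while !tag.is_empty() {
        chain.push(tag.clone());
        match tag.rfind('-') {
            Some(i) => tag.truncate(i),
            None => break,
        }
    }
    if !chain.iter().any(|t| t.eq_ignore_ascii_case(instance_default)) {
        chain.push(instance_default.to_string());
    }
    chain
}

/// Pick the first locale in the chain that the content is available in.
/// Tags are compared ASCII case-insensitively.
pub fn pick_locale<'a>(chain: &'a [String], available: &[&str]) -> Option<&'a str> {
    chain
        .iter()
        .find(|t| available.iter().any(|a| a.eq_ignore_ascii_case(t.as_str())))
        .map(|t| t.as_str())
}
```

A `None` result is the case the fallback bullet asks about: the content exists in none of the acceptable locales, so the UI has to decide whether to show it untranslated.
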
---
## Federation and ActivityPub
**Q17 [P2]** How do FeDIY project objects map to ActivityPub types?
- Use `Note` or `Article` for broad compatibility, or define a custom `FeDIYProject` type?
- If custom, what fallback representation do we provide for clients that don't understand it?
**Q18 [P2]** What ActivityPub activity types does FeDIY support in phase 2?
- Minimum set: `Create`, `Update`, `Delete`, `Follow`, `Accept`, `Reject`, `Undo`.
- Phase 2 or later: `Like`, `Announce` (boost), `Flag` (report)?
**Q19 [P2]** What is the media federation strategy?
- Do media attachments (images, files) federate as links to the canonical origin, or are they replicated locally?
- How are broken/unavailable remote media handled in the UI?
**Q20 [P2]** What is the HTTP Signatures key lifecycle?
- Per-actor keypairs (standard), or instance-level signing with `keyId` delegation?
- Key rotation: when and how are keys rotated, and how are remote instances notified?
**Q21 [P2]** Which well-known endpoints are in scope for phase 2?
- WebFinger (required for actor discovery).
- NodeInfo (instance metadata for compatibility and listing services).
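- `/.well-known/oauth-authorization-server` (OAuth 2.0 authorization server metadata, RFC 8414) if OAuth2 is supported; OIDC discovery uses `/.well-known/openid-configuration`.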
**Q22 [P2]** How is WebFinger structured for FeDIY entities?
- Actors are users: `acct:user@instance` — standard.
- Do projects also have addressable AP identities, or do they belong to the author actor?
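
Whatever is decided for projects, the user case requires parsing the `acct:user@instance` form out of the WebFinger `resource` parameter. A minimal sketch with deliberately light validation; `parse_acct` is a hypothetical helper.

```rust
/// Parse a WebFinger `resource` value of the form `acct:user@instance`.
/// Returns (user, instance), or None if the value is not in that shape.
pub fn parse_acct(resource: &str) -> Option<(&str, &str)> {
    let rest = resource.strip_prefix("acct:")?;
    // Split on the last '@' so the host part is unambiguous.
    let (user, host) = rest.rsplit_once('@')?;
    if user.is_empty() || host.is_empty() || user.contains('@') {
        return None;
    }
    Some((user, host))
}
```

A real handler would additionally check that `instance` matches the local domain before looking up the user.
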
---
## Search and Discovery
**Q23 [P1]** What is the full-text search implementation?
- PostgreSQL full-text search (zero extra infra), an embedded engine (Tantivy via Rust), or an external service (Meilisearch, Elasticsearch)?
- What fields are indexed: title, description, steps, tags, materials?
**Q24 [P2]** Does search span federated content?
- Phase 1: local content only.
- Phase 2+: do we query remote instances, or build a local index of federated objects we've received?
---
## Media Storage
**Q25 [P1]** How are user-uploaded media assets stored?
- Local filesystem, S3-compatible object storage, or both with a configurable backend?
- What is the maximum file size and permitted formats for MVP?
**Q26 [P1]** Is image processing (resize, thumbnail, format conversion) in-process or delegated?
- In-process with a Rust image library, or delegated to an external service/worker?
---
## Persistence
**Q27 [P0/P1]** What is the database?
- PostgreSQL is the obvious choice given full-text search, JSONB for flexible AP object storage, and broad hosting support — is this confirmed?
- Is SQLite a supported option for lightweight self-hosting, or is that complexity not worth it?
**Q28 [P1]** What is the database migration strategy?
- A Rust migration library (sqlx migrate, refinery), or a standalone tool (Flyway, Liquibase)?
---
## Moderation
**Q29 [P1]** How are moderator roles assigned and scoped?
- Instance admin assigns moderators manually; no self-service promotion.
- Are there multiple moderation tiers (e.g. content moderator vs instance admin)?
**Q29a [P1]** What user-level personal moderation tools are provided, and how are they represented in the data model?
- A user must be able to block specific local and remote accounts without any moderator involvement. What is the API shape for a user-level block?
- A user must be able to block an entire remote instance. Does this map to an ActivityPub `Block` activity, a local filter record, or both?
- Are user-level mutes (suppress content without blocking) distinct from blocks in the data model?
- What keyword and wildcard filter capabilities are supported? Are filters applied server-side (content never delivered to client) or client-side (UI suppresses matching content)? Both options should be supportable.
- When a user blocks another, is the blocked party notified? (Convention in AP-based platforms is not to notify.)
- Do user-level blocks prevent the blocked party from seeing the user's public content, or only prevent interaction?
- How are user-level moderation actions represented in the user's own data export (GDPR portability)?
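
To ground the keyword and wildcard filter question, here is a sketch of server-side filtering with a single wildcard character. The filter grammar is an assumption (`*` matches any run of characters, matching is ASCII case-insensitive and covers the whole text); Q29a leaves the real grammar open. `wildcard_match` and `apply_filters` are hypothetical names.

```rust
/// Match `text` against `pattern`, where `*` matches any run of
/// characters (including none). The whole text must match.
pub fn wildcard_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().map(|c| c.to_ascii_lowercase()).collect();
    let t: Vec<char> = text.chars().map(|c| c.to_ascii_lowercase()).collect();
    let (mut pi, mut ti) = (0usize, 0usize);
    // Position of the last `*` seen and the text index it was tried at.
    let mut star: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = sp + 1;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    // Only trailing `*` may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}

/// Server-side filtering: keep only posts that match none of the filters.
pub fn apply_filters<'a>(posts: &[&'a str], filters: &[&str]) -> Vec<&'a str> {
    posts
        .iter()
        .copied()
        .filter(|post| !filters.iter().any(|f| wildcard_match(f, post)))
        .collect()
}
```

Because the matcher is a pure function, the same logic could run client-side for the "UI suppresses matching content" mode, which is one way to keep both options supportable.
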
**Q29b [P1/P2]** What is the data model and AP representation for shareable block and recommendation lists?
- Shareable block lists: what format is used for export and import? JSON-LD? A well-known community format (e.g., Oliphant CSV, FediBlock)?
- Can a block list be published as a live federated resource that subscribers can follow and receive updates from?
- Recommendation lists: are these a first-class AP object type (e.g., `OrderedCollection` of `Project` objects), or a local-only feature?
- Personal collections/bookmarks: are these private by default with an explicit publish action, or public by default?
- What privacy model applies to list subscriptions? Can a user see who has subscribed to their block list?
- How does subscribing to another user's block list interact with the subscriber's own moderation decisions? (Additive by default; subscriber retains override.)
- Are community-maintained lists (collaboratively edited by a group) in scope, and if so, what is the authorship/edit governance model?
- Can lists be versioned or snapshotted so a subscriber can audit what changed between updates?
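
The "additive by default; subscriber retains override" behavior noted above can be expressed as set arithmetic. In this sketch overrides cancel only entries that arrive via a subscription, never the user's own blocks; that interpretation is an assumption, as are the names `effective_blocks` and its parameters.

```rust
use std::collections::BTreeSet;

/// Compute the domains a user effectively blocks: their own blocks plus
/// every subscribed list (additive), except subscribed entries the user
/// has chosen to override.
pub fn effective_blocks(
    own: &BTreeSet<String>,
    subscribed: &[BTreeSet<String>],
    overrides: &BTreeSet<String>,
) -> BTreeSet<String> {
    let mut result = own.clone();
    for list in subscribed {
        for domain in list {
            // Overrides only cancel entries that came from a subscription.
            if !overrides.contains(domain) {
                result.insert(domain.clone());
            }
        }
    }
    result
}
```

Computing the effective set on read, instead of copying subscribed entries into the user's own list, also keeps list updates and unsubscribes auditable.
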
**Q30 [P3]** Do we participate in shared blocklists (e.g. FIRES, Oliphant tiers)?
- Subscribe to external block lists automatically, or manual import only?
- Is this a phase 3 concern or deferred entirely?
- Does the user-level list subscription mechanism (Q29b) subsume this, or is it a separate instance-level concern?
**Q31 [P3]** What is the user-facing report and appeals workflow?
- Can a user see the status of their own report?
- Is there a formal appeals process, or are appeals handled at moderator discretion?
---
## Deployment and Operations
**Q32 [P0/P1]** What is the primary deployment target?
- Single static binary + external PostgreSQL (simplest self-hosting).
- OCI/Docker container image.
- NixOS module.
- All three, or a prioritised subset?
**Q33 [P1]** What are the minimum self-hosting requirements?
- RAM, CPU, disk, and network minimums for a small instance.
- Is there a single-binary mode with embedded SQLite for hobbyist hosting (see Q27)?
**Q34 [P1]** What is the configuration strategy?
- Environment variables only, a config file (TOML), or both?
- What must be configurable per-instance (instance name, federation policy, storage backend, SMTP, etc.)?
**Q35 [P4]** What observability stack is expected?
- Structured logging to stdout (12-factor), Prometheus metrics endpoint, OpenTelemetry traces?
- Are these required at launch or added progressively?
---
## Content and Community Policy
**Q36 [P0]** Does the platform define baseline content guidelines beyond what moderation tooling enforces?
- What categories of content are prohibited regardless of instance policy?
- CSAM must be prohibited and reported — is there a mechanism planned?
**Q37 [P4]** Are collaborative/co-authorship workflows in scope?
- Can multiple accounts be listed as co-authors of a project?
- Is there a contribution workflow (pull-request style) or trust-based co-author invite?
---
## Privacy and Legal Compliance
**Q38 [P0/P1]** What personal data does FeDIY collect and what is the lawful basis for each category?
- Account registration data (email, display name, password hash): processed under contract. What is the minimum required set?
- IP address and access logs: is there a documented retention window and purge policy before Phase 1 launches?
- Content data (projects, drafts, media): processed under contract. Drafts are private; what is the data model distinction between draft and published that ensures privacy?
- Session and authentication tokens: what is the TTL and purge-on-logout policy?
- Does the platform ever use personal data for purposes beyond what is needed to operate the service (e.g. analytics, recommendations)? If so, is consent obtained?
**Q39 [P1]** What does the right-to-access (GDPR Article 15) export look like?
- What data categories are included in a user's full export: account profile, published projects, draft projects, interactions (follows, likes, bookmarks, block lists), moderation history affecting the user, session and audit records?
- What format is the export? JSON is required; is a human-readable HTML or PDF summary also provided?
- Is the export self-service (user-initiated via UI and API) or does it require admin action?
- What is the SLA for delivering an export? GDPR requires a response within one month of the request, extendable by two further months for complex or numerous requests.
- How are exports of federated data handled — i.e. data about the user that exists on remote instances? The export must explain this limitation.
**Q40 [P1]** What does account deletion (GDPR Article 17 — right to erasure / right to be forgotten) entail?
- What is the deletion workflow? It must be self-service, not admin-gated.
- What data is fully erased: credentials, profile fields, private drafts, session tokens, email address, IP logs?
- What is tombstoned rather than erased: publicly published content that is referenced by others? The privacy notice must explain the tombstone policy.
- What moderation records are retained in anonymised form for legal/safety obligations, and for how long?
- How long after a deletion request is the data fully purged? A maximum window (e.g. 30 days) must be defined and disclosed.
- How is deletion propagated to federated instances (GDPR Art. 17(2))? An ActivityPub `Delete` activity is sent to known peers; remote compliance cannot be guaranteed. The privacy notice must say this.
- Is there a delivery-receipt log for `Delete` activities sent to federated peers, retained for operator accountability (separate from user personal data)?
- Is there a deletion-in-progress state visible to the user while federation propagation is completing?
- Does the architecture support deletion of individual published items (a single project, a comment) without requiring full account closure?
- Is legal hold (suspension of deletion during active investigation or legal proceedings) required? If so, how is the user notified?
- For financial/transaction records, what is the operator-configurable retention window to satisfy accounting law?
**Q40a [P1]** How does the platform support compliance with US state privacy laws alongside GDPR?
- CCPA/CPRA (California): deletion SLA is 45 days (extendable to 90). Does the system support configurable SLA windows per instance?
- CCPA/CPRA: right to opt-out of sale/sharing of personal data. FeDIY does not sell data, but does the architecture support the Global Privacy Control (GPC) signal as an opt-out mechanism?
- CCPA/CPRA: right to correct inaccurate personal information. Is this satisfied by standard profile editing?
- CCPA/CPRA: right to know (categories of data collected, sources, purposes, third parties). Is this satisfied by the privacy notice and data export?
- Virginia VCDPA, Colorado CPA, Connecticut CTDPA, Texas TDPSA and others: broadly equivalent deletion and portability rights with 45-day SLAs. Does the jurisdiction-agnostic deletion workflow satisfy all of these?
- Colorado CPA specifically: Global Privacy Control signals must be honored. Is GPC processing in scope for Phase 1?
- Brazil LGPD Article 18: right to deletion of unnecessary or unlawfully processed data, equivalent in scope to GDPR Art. 17. Is the architecture jurisdiction-agnostic enough to satisfy this without custom logic?
- Canada PIPEDA (and forthcoming CPPA / Bill C-27): limited right to deletion currently; architecture should anticipate the stronger rights in Bill C-27 when enacted. What future-proofing is needed?
- UK GDPR: equivalent to EU GDPR; same architecture satisfies it, but does the template privacy notice need UK-specific customisation (ICO contact details, UK law references)?
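
A configurable SLA window per instance could be as small as a pair of day counts. The sketch below only restates figures already quoted in this document (45 days extendable to 90 for CCPA/CPRA, and the 30-day example from Q40); it is not legal guidance. `DeletionSla` and `strictest` are hypothetical, and days are counted from an arbitrary epoch to avoid calendar handling.

```rust
/// Per-instance deletion SLA, in days. Values are operator configuration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DeletionSla {
    pub base_days: u32,
    pub max_extended_days: u32,
}

impl DeletionSla {
    /// CCPA/CPRA figure quoted above: 45 days, extendable to 90.
    pub const CCPA: DeletionSla = DeletionSla {
        base_days: 45,
        max_extended_days: 90,
    };

    /// Day (counted from an arbitrary epoch) by which the purge must finish.
    pub fn deadline(&self, requested_on_day: u32, extended: bool) -> u32 {
        let window = if extended {
            self.max_extended_days
        } else {
            self.base_days
        };
        requested_on_day + window
    }

    /// True once `today` is past the deadline.
    pub fn is_overdue(&self, requested_on_day: u32, today: u32, extended: bool) -> bool {
        today > self.deadline(requested_on_day, extended)
    }
}

/// With several applicable regimes, pick the shortest base window.
pub fn strictest(slas: &[DeletionSla]) -> Option<DeletionSla> {
    slas.iter().copied().min_by_key(|s| s.base_days)
}
```

Picking the strictest window is one way to keep the deletion workflow jurisdiction-agnostic: an instance that meets the shortest deadline meets the longer ones too.
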
**Q40b [P1]** What is the operator guidance for jurisdiction-specific compliance customisation?
- Which compliance settings are configurable per-instance (SLA window, data categories, GPC handling, legal hold policy)?
- Does FeDIY provide a compliance checklist for operators launching a public instance?
- Does the template privacy notice include jurisdiction-specific variants (EU, UK, California, Brazil) or a single configurable template?
- How does FeDIY communicate clearly that the instance operator is the data controller and bears primary legal responsibility?
**Q41 [P1]** What are the data retention periods for each category?
- Authentication and failed-login logs: short retention (30-90 days suggested); is this configurable by instance operators?
- IP address logs: are these stored at all beyond ephemeral request processing? If so, what is the maximum retention window?
- Deleted account data: what is the maximum time between deletion request and full purge of identifiable data?
- Moderation records: anonymised retention for legal purposes — what anonymisation process is applied and for how long are they kept?
- Inactive account data: is there a policy for purging accounts that have never been activated or have been dormant for an extended period?
**Q42 [P1]** What is the data portability (Article 20) export format, and can it be imported?
- Is the export format capable of round-tripping into another FeDIY instance (i.e. migrate your account and projects to a different instance)?
- Does the export include ActivityPub actor identity in a way that helps federated peers update their records after a migration?
- Is account migration (move actor from instance A to instance B, with follower redirect) in scope? If so, which phase?
**Q43 [P0/P1]** What is the privacy notice strategy for instance operators?
- FeDIY provides a template privacy notice that operators must customise. What is the minimum required content?
- Is the privacy notice served at a well-known URL (`/privacy`) before the instance accepts registrations?
- How does the platform ensure that operators have published a privacy notice? (e.g. configuration check at startup, NodeInfo metadata)
- Who is the data controller for a self-hosted instance? The instance operator. Is this clearly communicated in the code and documentation?
**Q44 [P1]** What is the children's privacy policy?
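- What minimum age is required to register? (GDPR Article 8 requires parental consent below age 16 by default, though member states may lower that threshold to as low as 13; in the US, COPPA requires verifiable parental consent for children under 13.)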
- Is age verification self-declaration only, or is a stronger mechanism required?
- What happens if a minor's account is reported? Is there a defined response process?
**Q45 [P2]** How are federated data subjects' rights handled?
- If a user on a remote instance requests erasure of data held locally (e.g. cached profile, received activities), what is the process?
- Does receiving a `Delete` activity for a remote actor trigger a purge of all locally cached data about that actor?
- Are there GDPR obligations that apply to data received via federation from instances in different jurisdictions?