Palantir Foundry Developer Handbook
A production-oriented handbook for engineers, FDEs, analytics leads, and application builders working across Foundry's data integration, transforms, Ontology, and operational application stack.
Table of Contents
This handbook follows the same mental model Palantir uses in the documentation: data first lands as datasets, is shaped into reliable pipelines, becomes business-native through the Ontology, and is then activated in analytics and operational applications.
- Module 1: Foundry Architecture and Core Concepts. Platform philosophy, Compass, Projects, folders, RIDs, data branching, and Data Lineage.
- Module 2: Data Integration and Ingestion. How Foundry connects to JDBC systems, APIs, file stores, and operational systems, then lands raw data as governed datasets.
- Module 3: Data Transformation in Code Repositories. Foundry-native Python transforms, incremental pipelines, and when to choose Code Repositories over Pipeline Builder.
- Module 4: The Ontology. Why Foundry models the business as an operational graph instead of leaving teams with raw tables and joins.
- Module 5: Analytics and Operational Applications. How analysts, data scientists, and operational teams consume and act on data without breaking governance.
- Module 6: Security and Governance. Mandatory controls, project roles, policy enforcement, and how security propagates with data rather than relying on ad hoc dashboards.
- Module 7: Real-World Use Cases. End-to-end scenarios tying ingestion, transforms, Ontology, ML, analytics, and actions into operational systems.
Module 1: Foundry Architecture and Core Concepts
The Platform Philosophy
Foundry is not trying to be a prettier data lake. It is trying to close the gap between data engineering and operations. The core journey is: Integration -> Transformation -> Ontology -> Applications. Each step adds more structure, more accountability, and more operational usefulness.
A useful analogy is an industrial refinery. Raw crude oil is valuable, but nobody wants to run a logistics operation directly on crude oil. You refine it into diesel, jet fuel, and lubricants with known quality and governance. Foundry does the same for enterprise data: raw ERP extracts and API payloads are refined into reusable datasets, then elevated into business-native entities like Factory, Part, Shipment, or Transaction.
Compass and the Filesystem
Compass is the shared file-and-resource layer of Foundry. Public documentation describes Projects and resources as the basic building blocks of the platform. A Project is the collaboration boundary. A resource is the thing inside it: dataset, repository, analysis, application, report, or other artifact.
- Projects are the main collaboration and security boundary. They carry project-level roles, discovery settings, and inherited access controls.
- Folders are organizational structure inside a project. They keep a project navigable but do not replace the project as the primary governance boundary.
- Resources are the actual working assets. In Foundry terms, a dataset is a resource, a code repository is a resource, and a Workshop application is a resource.
- RIDs are globally unique resource identifiers. They matter because names and folder locations can change, while the RID remains the canonical machine identity.
Think of a Project like a secured building, folders like rooms, and resources like the equipment in those rooms. The label on the door may change, but the serial number engraved on the machine stays fixed. That serial number is the RID.
| Concept | What it is | Why it matters |
|---|---|---|
| Project | Primary collaboration and permission boundary | Standardizes access, ownership, and discoverability for related work |
| Folder | Organizational container inside a Project | Keeps complex delivery programs navigable |
| Resource | Dataset, repo, dashboard, app, report, or other asset | Common security, metadata, comments, sharing, and auditing model |
| RID | Stable unique identifier for a resource | Decouples references from fragile human-readable paths |
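The path-versus-RID distinction can be sketched in plain Python. This is a toy in-memory catalog, not the Foundry API; the `Catalog` class and its methods are illustrative only, and the RID format shown is only RID-like.

```python
import uuid

class Catalog:
    """Toy resource catalog: paths are mutable labels, RIDs are stable identity."""

    def __init__(self):
        self._by_rid = {}   # rid -> resource metadata
        self._by_path = {}  # current path -> rid

    def register(self, path, kind):
        rid = f"ri.compass.main.{kind}.{uuid.uuid4()}"  # illustrative RID-like format
        self._by_rid[rid] = {"path": path, "kind": kind}
        self._by_path[path] = rid
        return rid

    def move(self, rid, new_path):
        # Renames and moves change only the human-readable path.
        old_path = self._by_rid[rid]["path"]
        del self._by_path[old_path]
        self._by_rid[rid]["path"] = new_path
        self._by_path[new_path] = rid

    def resolve(self, rid):
        return self._by_rid[rid]

catalog = Catalog()
rid = catalog.register("/Acme/SupplyChain/raw/erp_shipments", "dataset")
catalog.move(rid, "/Acme/SupplyChain/archive/erp_shipments_v1")

# The RID still resolves after the move; a hard-coded path would not.
print(catalog.resolve(rid)["path"])  # /Acme/SupplyChain/archive/erp_shipments_v1
```

This is why integrations and pipelines should reference RIDs rather than paths: reorganizing a Project does not break machine references.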
Datasets themselves are wrappers around files stored in a backing filesystem, often cloud object storage. The value of the dataset abstraction is not the bytes alone. It is the managed metadata around them: schema, transactions, branches, permissions, lineage, and build semantics.
Branching and Data Lineage
This is where Foundry usually clicks for engineering teams. Most platforms version code. Foundry versions code and data together. Public documentation explicitly describes dataset transactions and dataset branches as the basis of Foundry's "Git for data" behavior.
- Dataset transactions are atomic changes to a dataset's contents. Foundry supports `SNAPSHOT`, `APPEND`, `UPDATE`, and `DELETE` transactions.
- Dataset branches are pointers to transaction histories on a dataset. They are conceptually similar to Git branches, but the docs note that dataset branches themselves are not merged like Git branches.
- Build branches tie Git-like code branching to dataset branching. When you build on a feature branch, Foundry compiles the branch's JobSpecs and writes outputs only to that branch.
- Fallback chains let a branch read branch-local logic or data where present and fall back to `master` where nothing changed.
Traditional data stacks usually force teams into one of two bad choices: either test against production-like data outside the main pipeline, or run risky changes directly in shared production tables. Foundry's branch-aware build system provides a third option: a safe rehearsal environment where both transformation logic and downstream datasets can evolve together.
```
Code branch:      feature/late-shipments
Dataset branches: raw_orders@master, curated_orders@feature, alerts@feature
Build fallback:   feature -> master
```

Result:
- unchanged upstream inputs can still be read from master
- changed transforms publish branch-specific JobSpecs
- changed outputs materialize only on the feature branch
- downstream users on master see no disruption
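The fallback behavior above can be sketched in a few lines of plain Python. This is a toy model of branch resolution, not a Foundry API; `resolve_branch` and the dict layout are illustrative.

```python
def resolve_branch(dataset_branches, dataset, branch, fallback="master"):
    """Return the branch whose data should be read: the feature branch if the
    dataset was rebuilt there, otherwise the fallback branch."""
    branches = dataset_branches.get(dataset, set())
    if branch in branches:
        return branch
    return fallback

# Which datasets have branch-specific outputs (toy state for one feature branch).
dataset_branches = {
    "raw_orders": {"master"},                                 # unchanged upstream input
    "curated_orders": {"master", "feature/late-shipments"},   # rebuilt on the branch
    "alerts": {"master", "feature/late-shipments"},
}

branch = "feature/late-shipments"
print(resolve_branch(dataset_branches, "raw_orders", branch))      # master
print(resolve_branch(dataset_branches, "curated_orders", branch))  # feature/late-shipments
```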
Data Lineage is the explainer surface for all of this. The docs describe it as an interactive tool for holistically viewing how data flows through the platform. In practice, it is the control tower for understanding:
- where a dataset came from,
- which transforms produced it,
- which downstream datasets and applications depend on it,
- what code generated it,
- whether it is stale or out of date, and
- what security requirements were inherited along the way.
The best analogy is software dependency tracing plus change impact analysis, but for data products. Instead of asking "What services call this API?", you ask "If I change this transform or this upstream source, which datasets, object types, dashboards, and Workshop modules become affected?"
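The impact question can be sketched as a breadth-first walk over a toy lineage graph. Node names and the dict layout are illustrative; this is not the Data Lineage API, only the shape of the traversal.

```python
from collections import deque

def downstream_impact(edges, changed):
    """Walk the lineage graph breadth-first and collect everything downstream
    of a changed node. `edges` maps node -> list of direct consumers."""
    affected, queue = set(), deque([changed])
    while queue:
        node = queue.popleft()
        for consumer in edges.get(node, []):
            if consumer not in affected:
                affected.add(consumer)
                queue.append(consumer)
    return affected

# Toy lineage: raw dataset -> curated dataset -> object type -> app surfaces.
edges = {
    "raw_erp_shipments": ["curated_orders"],
    "curated_orders": ["shipment_object_type", "late_shipments_dashboard"],
    "shipment_object_type": ["workshop_planner_app"],
}

print(sorted(downstream_impact(edges, "raw_erp_shipments")))
# ['curated_orders', 'late_shipments_dashboard', 'shipment_object_type', 'workshop_planner_app']
```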
| Traditional stack problem | Foundry answer | Enterprise value |
|---|---|---|
| Opaque ETL jobs | Lineage graph across datasets, code, and builds | Faster root-cause analysis and safer change review |
| Shared production tables make experimentation risky | Code branches plus dataset branches | Parallel development without corrupting shared outputs |
| Security reviews happen after data is copied around | Lineage-aware security inheritance | Access control moves with the data automatically |
Module 2: Data Integration and Ingestion
Data Connection
Foundry's public documentation frames ingestion around the Data Connection framework. That framework is designed to manage source connections over time using dataset transactions, branching, granular security, and synchronized metadata. In field language, practitioners often refer to the underlying connector and orchestration pattern as Magritte and source agents; in the product docs, the supported surface is the Data Connection application and its source-specific sync capabilities.
At a practical level, Data Connection gives Foundry a standard contract for pulling from very different sources: JDBC systems, APIs, file stores, and operational systems.
A good analogy is a managed loading dock for the enterprise. The loading dock does not care whether the incoming goods arrived by truck, ship, or rail. It standardizes intake, manifests, timestamps, security checks, and hand-off into the warehouse. Data Connection plays that role for source systems.
Syncs and Schedules
Foundry distinguishes between getting data in once and getting data in reliably over time. That sounds obvious, but it is where many data platforms quietly fail. A one-off extract is not a data product. A repeatable sync with lineage, permissions, and clear transaction semantics is.
| Pattern | How it works | When to use it |
|---|---|---|
| Direct or manual ingestion | Upload or one-time import into a dataset resource | Bootstrapping, prototypes, ad hoc investigations |
| Scheduled sync | Recurring ingestion that lands new dataset transactions over time | Operational production pipelines |
| Virtual access | Expose source data without full replication in some cases | Latency-sensitive or governance-constrained access patterns |
| External transforms | Code-driven scheduled API interaction using Code Repositories | Custom REST ingestion and outbound integration workflows |
What lands in the Foundry filesystem is usually a raw dataset. That raw dataset is intentionally close to source reality. You do not want business logic hidden in the loading step. The source system should remain auditable, and the refinement should happen downstream in transformations.
Transaction-Aware Ingestion
The public dataset documentation is explicit that ingestion style matters because it determines downstream pipeline behavior:
- SNAPSHOT means the sync replaces the full current view. Simple, but expensive at scale.
- APPEND means only new files are added. This is the foundation for performant incremental pipelines.
- UPDATE means new files may arrive and old files may be overwritten. Useful when the source mutates records, but it breaks append-only assumptions.
- DELETE supports retention and controlled removal from the current dataset view.
If you are building a serious production pipeline, the ingestion mode is not just a connector setting. It is an architectural decision that determines cost profile, latency, and whether downstream pipelines can remain incremental.
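The four transaction types can be modeled in a few lines of plain Python to make the downstream implications concrete. This is a toy simulation of the current-view semantics, not Foundry's implementation.

```python
def apply_transaction(current_files, txn_type, files):
    """Toy model of how each transaction type changes a dataset's current view.
    `current_files` and `files` map filename -> contents."""
    if txn_type == "SNAPSHOT":
        return dict(files)                 # replace the full current view
    if txn_type == "APPEND":
        assert not set(files) & set(current_files), "APPEND may not overwrite"
        return {**current_files, **files}  # add new files only
    if txn_type == "UPDATE":
        return {**current_files, **files}  # add new files and overwrite existing
    if txn_type == "DELETE":
        return {k: v for k, v in current_files.items() if k not in files}
    raise ValueError(txn_type)

view = apply_transaction({}, "SNAPSHOT", {"day1.csv": "a"})
view = apply_transaction(view, "APPEND", {"day2.csv": "b"})
view = apply_transaction(view, "UPDATE", {"day1.csv": "a-corrected"})
view = apply_transaction(view, "DELETE", {"day1.csv": None})
print(sorted(view))  # ['day2.csv']
```

Notice that only APPEND preserves the invariant incremental pipelines depend on: existing files are never rewritten, so downstream logic can safely process just the new arrivals.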
Module 3: Data Transformation in Code Repositories
The Python Transforms API
Foundry's transforms.api is the contract between your Python code and Foundry's build system. This is what turns a PySpark function into a governed pipeline step with declared inputs, outputs, lineage, checks, preview support, branching, and scheduling.
That declaration step is the important difference from generic PySpark. In open Spark, you can read anything and write anywhere as long as the cluster permits it. In Foundry, you declare the data contract up front so the platform can reason about lineage, impact, permissions, and builds.
Copy-Pasteable PySpark Transform
```python
from pyspark.sql import functions as F
from transforms.api import transform_df, Input, Output


@transform_df(
    Output("/Acme/SupplyChain/curated/factory_part_demand"),
    shipments=Input("/Acme/SupplyChain/raw/erp_shipments"),
    part_master=Input("/Acme/SupplyChain/master/parts"),
)
def compute(shipments, part_master):
    return (
        shipments
        .join(part_master, on="part_id", how="left")
        .withColumn("required_date", F.to_date("required_timestamp"))
        .withColumn("late_flag", F.col("promised_timestamp") > F.col("required_timestamp"))
        .withColumn("open_value_usd", F.round(F.col("quantity") * F.col("unit_cost_usd"), 2))
        .select(
            "shipment_id",
            "factory_id",
            "part_id",
            "part_description",
            "required_date",
            "quantity",
            "open_value_usd",
            "late_flag",
        )
    )
```
This snippet uses the standard Foundry wrapper imports: `from transforms.api import transform_df, Input, Output`. From the declaration alone, the platform knows:
- which datasets are read,
- which dataset is produced,
- how to render the node in lineage,
- what changes a pull request may impact, and
- which branch-specific output should be built during development.
Why Incremental Processing Matters
Official Foundry documentation is direct here: incremental pipelines avoid recomputing unchanged data and are often necessary when input scale is high. If your transaction logs are growing by millions of rows per day, full snapshots are a tax you will keep paying forever.
The analogy is simple. A nightly batch rebuild is like recalculating every bank account in the country because one customer made a deposit. Incremental processing instead says: process the new deposit, update the affected state, and move on.
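The deposit analogy in miniature, in plain Python with no Foundry APIs:

```python
def apply_deposits(balances, new_deposits):
    """Incremental update: touch only the accounts that appear in the new batch,
    instead of recomputing every balance from full history."""
    for account, amount in new_deposits:
        balances[account] = balances.get(account, 0) + amount
    return balances

balances = {"acct-1": 100, "acct-2": 250}
apply_deposits(balances, [("acct-1", 40)])  # one new deposit arrives
print(balances)  # {'acct-1': 140, 'acct-2': 250}
```

The work done is proportional to the size of the new batch, not to the total number of accounts. Foundry's incremental transforms apply the same principle at dataset scale.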
Incremental PySpark Transform
```python
from pyspark.sql import functions as F
from transforms.api import (
    incremental,
    transform,
    Input,
    Output,
    IncrementalTransformInput,
)


@incremental()
@transform(
    risk_scores=Output("/Acme/AML/curated/transaction_risk_scores"),
    transactions=Input("/Acme/AML/raw/daily_transactions"),
    customers=Input("/Acme/AML/master/customers"),
)
def compute(ctx, risk_scores, transactions: IncrementalTransformInput, customers):
    new_transactions = transactions.dataframe("added")
    if ctx.is_incremental and new_transactions.rdd.isEmpty():
        return

    customer_df = customers.dataframe()
    scored = (
        new_transactions
        .join(customer_df, on="customer_id", how="left")
        .withColumn(
            "risk_score",
            F.when(F.col("amount_usd") >= 10000, F.lit(0.60)).otherwise(F.lit(0.10))
            + F.when(F.col("high_risk_country") == F.lit(True), F.lit(0.25)).otherwise(F.lit(0.00))
            + F.when(F.col("pep_flag") == F.lit(True), F.lit(0.15)).otherwise(F.lit(0.00)),
        )
        .withColumn(
            "risk_bucket",
            F.when(F.col("risk_score") >= 0.75, F.lit("HIGH")).otherwise(F.lit("STANDARD")),
        )
        .select(
            "transaction_id",
            "customer_id",
            "booking_date",
            "amount_usd",
            "risk_score",
            "risk_bucket",
        )
    )
    risk_scores.write_dataframe(scored)
```
This example uses IncrementalTransformInput directly and reads only the added window from the transactions input, which is the exact capability documented in the API reference. That is what keeps the transform proportional to new data instead of proportional to total historical data.
Incremental transforms generally assume append-style inputs; a SNAPSHOT or UPDATE transaction on an input can force the build to fall back to full recomputation. Use incremental logic when the volume justifies the added complexity.
Code Repositories vs Pipeline Builder
Both are first-class. The wrong move is treating one as "for engineers" and the other as "for non-engineers." The real decision is about control, complexity, and maintainability.
| Choose this | Best for | Trade-off |
|---|---|---|
| Code Repositories | Complex business logic, PySpark, custom libraries, tests, code review, reusable engineering standards | Higher engineering overhead, slower for simple mappings |
| Pipeline Builder | Fast delivery, visual composition, common joins/filters, streaming and batch patterns, lower-code delivery | Less expressive for specialized logic or heavy software-engineering workflows |
A useful heuristic:
- Use Pipeline Builder when the data flow is legible as a pipeline diagram and your transformations are primarily declarative.
- Use Code Repositories when you need software-engineering discipline: abstractions, libraries, advanced logic, unit tests, branch review, or specialized Spark behavior.
Module 4: The Ontology
Objects, Links, and Properties
The Ontology is the heart of Foundry because it changes the question from "What tables do we have?" to "What parts of the business are we representing, and how do they relate?" Public documentation describes the Ontology as an operational layer sitting on top of datasets, models, and other digital assets, connecting them to their real-world counterparts.
The conceptual shift is from row-oriented thinking to domain-oriented thinking. In a warehouse, a user may need to remember that fact_shipments.factory_id = dim_factory.id. In the Ontology, a user works with a Factory object that already knows its related Shipment, Supplier, or Part objects. The join logic becomes part of the platform's semantic contract rather than tribal knowledge in SQL.
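A toy illustration of that shift, using plain dataclasses rather than the Ontology SDK (the class and property names are hypothetical):

```python
from dataclasses import dataclass, field

# In the object model, a Factory "knows" its shipments, so consumers traverse a
# link instead of remembering fact_shipments.factory_id = dim_factory.id.
@dataclass
class Shipment:
    shipment_id: str
    late: bool

@dataclass
class Factory:
    factory_id: str
    shipments: list = field(default_factory=list)

    def late_shipments(self):
        return [s for s in self.shipments if s.late]

factory = Factory("F-01", [Shipment("S-1", late=False), Shipment("S-2", late=True)])
print([s.shipment_id for s in factory.late_shipments()])  # ['S-2']
```

The join condition lives once, in the link definition, instead of being re-typed in every downstream query.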
Why Foundry Pushes the Ontology So Hard
- Reuse: multiple applications can rely on the same semantic layer rather than each dashboard re-implementing business joins and definitions.
- Operational consistency: the same object and action definitions can be exposed in Workshop, Quiver, SDK-driven apps, and search.
- Governance: security and change control apply at the same layer where business users actually work.
- Decision capture: the platform does not stop at analytics. Actions and functions let users change operational state in a governed way.
Actions and Functions
Foundry distinguishes between action types and functions. Action types are the user-facing, governed transaction surface. Functions are server-side logic units that can compute values, return object sets, or generate Ontology edits. When an edit function is wired into an action type, users can safely write decisions back into the Ontology.
That is why Foundry applications are more than dashboards. A planner can change the state of a shipment. An investigator can escalate a transaction. An operations lead can assign ownership. The logic is centralized, audited, and permissioned.
Function-Backed Ontology Action Example
```typescript
import { Shipment } from "@ontology/sdk";
import { Client, Osdk } from "@osdk/client";
import { createEditBatch, Edits } from "@osdk/functions";

type ShipmentEdit = Edits.Object<Shipment>;

export default function requestExpedite(
  client: Client,
  shipment: Osdk.Instance<Shipment>,
  requestedBy: string,
  reason: string,
): ShipmentEdit[] {
  const batch = createEditBatch<ShipmentEdit>(client);
  batch.update(shipment, {
    status: "EXPEDITE_REQUESTED",
    expediteRequestedBy: requestedBy,
    expediteReason: reason,
    expediteRequestedAt: new Date().toISOString(),
  });
  return batch.getEdits();
}
```
This snippet follows the official TypeScript v2 functions pattern for Ontology edits: define an Edits type, create an edit batch with createEditBatch, update the object, and return the edits. Per the docs, the edits only take effect when the function is configured as a function-backed action.
Read-Oriented Python Function Example
```python
from functions.api import function
from ontology_sdk import FoundryClient
from ontology_sdk.ontology.objects import Transaction
from ontology_sdk.ontology.object_sets import TransactionObjectSet


@function
def high_risk_transactions(min_score: float) -> TransactionObjectSet:
    client = FoundryClient()
    return client.ontology.objects.Transaction.where(
        Transaction.object_type.riskScore >= min_score
    )
```
Use read-oriented functions like this when you need server-side logic for Workshop, Quiver, or other operational interfaces. Use edit-returning functions when the workflow needs governed writeback.
Module 5: Analytics and Operational Applications
Code Workspaces
Code Workspaces gives users managed JupyterLab, RStudio, and VS Code environments inside Foundry. The public docs emphasize that these workspaces inherit Foundry's security, permissions, branching, scheduling, and repository infrastructure.
For data scientists, the value is straightforward: work in a familiar notebook or IDE, but on the same governed datasets and object model as the rest of the platform. No shadow copy. No side channel. No unsecured export just to train a model.
- Use Code Workspaces for model exploration, feature engineering, evaluation, and research workflows.
- Use Code Repositories or Pipeline Builder for large-scale production transforms, because the docs explicitly note that Code Workspaces is single-node while the other tools leverage Spark-oriented infrastructure.
- Use the same repository and branching discipline to move successful exploratory work into production pipelines.
Contour and Quiver
Contour and Quiver are both analysis tools, but they sit on different mental models.
| Tool | Best mental model | Best use case |
|---|---|---|
| Contour | Table-centric, point-and-click analysis at scale | Large tabular analysis, dataset derivation, low-code transformations, dashboards over tables |
| Quiver | Object-aware and time-series-aware analytics | Ontology-driven analysis, linked-object exploration, operational dashboards, time-series workflows |
A simple analogy: Contour is closer to a governed, scalable spreadsheet-plus-query environment for tables. Quiver is closer to an operational analytics canvas where objects and signals are native citizens.
- Use Contour when your data is still mostly tabular, some data is not yet mapped into the Ontology, or you need large joins and transformations without writing code.
- Use Quiver when your data is mapped in the Ontology, relationships matter, time series matters, and the result should plug directly into operational applications.
Workshop
Workshop is Foundry's operational application builder. The docs describe it as a flexible, object-oriented application building tool that uses Ontology objects, links, actions, and functions as first-class building blocks. That is the key distinction: it is not merely a dashboard builder.
Think of Workshop as the last mile between a governed digital twin and the humans who need to operate the business. A CRM, an alert triage desk, a maintenance queue, a parts shortage cockpit, and a fraud review inbox are all natural Workshop workloads.
Why Foundry does this differently: in a traditional stack, the BI layer is read-only and the operational app is a separate engineering program. Foundry tries to collapse that gap so that analytics and operations run on the same semantic and governance substrate.
Module 6: Security and Governance
Markings and Mandatory Controls
Foundry's security model is built around the idea that access control should travel with the data, not be bolted onto the final dashboard. The docs frame this as a combination of mandatory controls and discretionary controls.
- Mandatory controls include Organizations and Markings. If a user does not meet them, roles do not help.
- Discretionary controls are roles like Owner, Editor, Viewer, and Discoverer on Projects and resources.
- Markings are conjunctive. A user must satisfy all applied markings to access the resource.
- Project roles determine what a user can do once they are allowed through the mandatory gate.
The public docs are especially clear that markings inherit both through the file hierarchy and through data dependencies. That means a sensitive upstream dataset can automatically impose additional data requirements on downstream derivatives.
If the raw dataset carries a PII marking, that constraint propagates unless it is deliberately and correctly removed as part of an approved transformation stage. This is exactly why compliance teams like the platform. You do not have to hope every downstream analyst remembered the sensitivity level. The platform enforces it.
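The conjunctive check and lineage inheritance can be sketched in plain Python. This is a toy model; the marking names, dict layout, and function names are illustrative, not Foundry's implementation.

```python
def effective_markings(lineage, own_markings, dataset):
    """Markings inherit along data dependencies: a dataset's effective markings
    are its own plus everything inherited from its upstream inputs."""
    inherited = set(own_markings.get(dataset, set()))
    for upstream in lineage.get(dataset, []):
        inherited |= effective_markings(lineage, own_markings, upstream)
    return inherited

def can_access(user_markings, required):
    # Markings are conjunctive: the user must satisfy every applied marking.
    return required <= user_markings

lineage = {"curated_customers": ["raw_customers"], "churn_report": ["curated_customers"]}
own_markings = {"raw_customers": {"PII"}, "churn_report": {"FINANCE"}}

required = effective_markings(lineage, own_markings, "churn_report")
print(sorted(required))                          # ['FINANCE', 'PII']
print(can_access({"FINANCE"}, required))         # False: PII marking not satisfied
print(can_access({"FINANCE", "PII"}, required))  # True
```

The report never had a PII marking applied directly; it inherited one through lineage, which is exactly the propagation behavior described above.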
CBAC and PBAC in Practice
In customer conversations you will often hear CBAC and PBAC. Foundry's public docs emphasize markings, organizations, roles, restricted views, and additional data requirements more than those acronyms, but the enterprise interpretation is usually:
| Model | How to think about it in Foundry | Typical implementation surface |
|---|---|---|
| CBAC | Classification-based access control. Access depends on the sensitivity classification attached to data. | Markings, organizations, inherited data requirements, project boundaries |
| PBAC | Purpose-based access control. Access is constrained to approved workflows and legitimate business purpose. | Project roles, application-specific access, action permissions, functions, restricted views, policy-driven workflow design |
In other words, CBAC answers "What classification is this data?" PBAC answers "Even if I can see it, what am I allowed to do with it in this workflow?" Foundry's advantage is that both questions can be enforced inside the same lineage-aware platform.
Automatic Propagation Through Lineage
The docs explicitly state that restricting access to a dataset restricts access to downstream derived data because markings inherit along data dependencies. That propagation is one of the most important reasons Foundry commands enterprise budget:
- It reduces accidental oversharing of sensitive derivatives.
- It makes data lineage a live control surface for security review.
- It allows teams to reason about the impact of expanding or removing access.
- It makes downstream application builders inherit governance rather than recreate it badly.
Security review becomes much more legible too. The access checker and Data Lineage views let teams inspect not only whether a user has access to a resource, but whether they meet additional data requirements inherited from lineage.
Module 7: Real-World Use Cases
Scenario 1: Supply Chain Command Center
Goal: ingest ERP and logistics data, produce operational Factory and Part objects, and let planners trigger an Expedite Shipment action from a Workshop application.
How the pieces combine
- Integration: Data Connection ingests ERP purchase orders, inventory balances, supplier confirmations, and shipment events from databases, S3 drops, or APIs.
- Transformation: Code Repositories standardize plant IDs, deduplicate part masters, calculate shortage risk, and produce curated datasets for supply planning.
- Ontology: Curated datasets are mapped into
Factory,Part,Supplier, andShipmentobjects with links likeFactory consumes PartandSupplier ships Shipment. - Applications: Workshop shows shortages, at-risk shipments, and planner work queues. Quiver charts lead-time deterioration over time. Actions capture interventions.
Why Foundry is strong here
In a conventional stack, the command center is often a fragile front-end project sitting on top of replicated warehouse views and custom service endpoints. In Foundry, the app can directly use object-aware search, links, actions, and security on top of the Ontology.
Representative action
A Workshop button calls a function-backed action similar to the requestExpedite example above. That action can require planner permissions, enforce that only at-risk shipments are eligible, and write the decision into the shipment object's writeback dataset so the whole organization sees the new state.
Scenario 2: Anti-Money Laundering Alerting
Goal: process daily transaction logs incrementally, score transactions in a model workflow, and surface high-risk Transaction objects in an investigator inbox.
How the pieces combine
- Integration: daily or near-real-time transaction files land as append-oriented datasets so the pipeline can stay incremental.
- Transformation: Foundry incremental transforms process only new transactions, enrich them with customer and sanctions context, and calculate base risk features.
- Code Workspaces: investigators and data scientists use JupyterLab or RStudio in Code Workspaces to train and validate models on the same governed data foundation.
- Ontology: scored records become
Transaction,Account,Customer, andCaseobjects, linked for search and triage. - Applications: Workshop provides an inbox for high-risk transactions, Quiver provides trend and time-series views, and actions let investigators escalate, dismiss, or open cases.
Where governance matters most
AML is a textbook case for lineage-aware security. Case data may require investigation-specific markings so one case team cannot casually inspect another case's evidence. The documentation's markings examples explicitly highlight case-based access control as a strong use case.
Representative model workflow
1. Land daily transactions as APPEND dataset transactions.
2. Run an incremental transform to derive features only for newly added records.
3. Score the new feature set from Code Workspaces or a model integration workflow.
4. Publish high-risk records to a curated dataset and map them into Transaction objects.
5. Surface those objects in a Workshop inbox.
6. Let investigators trigger actions such as Open Case, Escalate, or Dismiss with Reason.
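The scoring rule used in steps 2 through 4 can be unit-checked in plain Python before it is committed to a transform. The thresholds mirror the incremental example in Module 3; the function itself is a sketch, not production scoring logic.

```python
def score_transaction(txn):
    """Plain-Python mirror of the incremental transform's scoring rule:
    base score from amount, plus country and PEP adjustments."""
    score = 0.60 if txn["amount_usd"] >= 10000 else 0.10
    score += 0.25 if txn.get("high_risk_country") else 0.00
    score += 0.15 if txn.get("pep_flag") else 0.00
    bucket = "HIGH" if score >= 0.75 else "STANDARD"
    return round(score, 2), bucket

print(score_transaction({"amount_usd": 15000, "high_risk_country": True}))  # (0.85, 'HIGH')
print(score_transaction({"amount_usd": 500}))                               # (0.1, 'STANDARD')
```

Keeping the rule testable in isolation makes it easy to review threshold changes with compliance before they reach the pipeline.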
The result is not just "a fraud dashboard." It is a governed operational system that unifies ingestion, scoring, review, and decision capture.
Foundry Patterns to Remember
- Version code and data together: develop on branches, let builds write branch-specific outputs, and rely on fallback to master for unchanged inputs.
- Treat the ingestion transaction type (SNAPSHOT, APPEND, UPDATE, DELETE) as an architectural decision, not a connector setting.
- Keep raw datasets close to source reality and put business logic in downstream transforms.
- Go incremental once input volume makes full recomputation expensive.
- Model the business once in the Ontology and reuse it across Workshop, Quiver, and SDK-driven applications.
- Let markings and lineage-aware inheritance carry security with the data instead of bolting it onto the final dashboard.
Official Docs Used
- Overview: Foundry documentation home
- Projects: Projects and resources
- Datasets: Datasets and transactions
- Branching: Branching
- Lineage: Data Lineage
- Integration: Data connectivity and integration
- Connect: Connecting to data
- Pipelines: Building pipelines
- Transforms: transforms.api reference
- Transforms: Python transforms getting started
- Incremental: Incremental transforms
- Ontology: Ontology overview
- Actions: Action types overview
- Functions: Functions overview
- TS edits: TypeScript v2 ontology edits
- Py objects: Python functions on objects
- Code WS: Code Workspaces
- Analytics: Analytics overview
- Contour: Contour overview
- Quiver: Quiver overview
- Workshop: Workshop overview
- Security: Security and governance
- Glossary: Security glossary
- Markings: Markings
- Access: Checking permissions