Getting Started
Haetae is an incremental task runner.
A task can be a test, lint, build, or anything else.
It can be used in any project, no matter what language, framework, test runner, linter/formatter, build system, or CI you use.
This 'Getting Started' article starts with an example of incremental testing.
Why?
Let's say you're building a calculator project, named 'my-calculator'.
my-calculator
├── package.json
├── src
│   ├── add.js
│   ├── exponent.js
│   ├── multiply.js
│   └── subtract.js
└── test
    ├── add.test.js
    ├── exponent.test.js
    ├── multiply.test.js
    └── subtract.test.js
In the dependency graph, exponent.js depends on multiply.js, which depends on add.js, and so on.
When testing, we should take the dependency graph into account.
We do NOT have to test all files (*.test.js) for every single tiny change, which wastes CI resources and time.
Rather, we should test incrementally, which means testing only the files affected by the changes.
For example, when multiply.js is changed, test only exponent.test.js and multiply.test.js.
When add.js is changed, test all files (exponent.test.js, multiply.test.js, subtract.test.js, and add.test.js).
When a test file (e.g. add.test.js) is changed, execute only that test file itself.
Then how can we do this automatically?
Here's where Haetae comes in.
With just a simple config, Haetae can automatically detect the dependency graph and test only the affected files.
(You do not have to change your test runner. In this article, Jest is used just as an example.)
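To make the selection rule concrete, here is a minimal sketch in plain JavaScript. It does not use Haetae at all. The graph object and the helpers dependsOn and affectedTests are hypothetical names made up for this illustration, and the edge from subtract.js to add.js is an assumption inferred from the example above (a change to add.js triggers every test).

```javascript
// Hypothetical, hand-written dependency graph for my-calculator.
// Each key is a file; its value lists the files it directly depends on.
const graph = {
  'src/add.js': [],
  'src/subtract.js': ['src/add.js'], // assumed from the example above
  'src/multiply.js': ['src/add.js'],
  'src/exponent.js': ['src/multiply.js'],
  'test/add.test.js': ['src/add.js'],
  'test/subtract.test.js': ['src/subtract.js'],
  'test/multiply.test.js': ['src/multiply.js'],
  'test/exponent.test.js': ['src/exponent.js'],
}

// True if `file` is `target` itself or depends on it directly or transitively.
function dependsOn(file, target, seen = new Set()) {
  if (file === target) return true
  if (seen.has(file)) return false // already explored, avoids endless loops
  seen.add(file)
  return (graph[file] ?? []).some((dep) => dependsOn(dep, target, seen))
}

// Test files affected by at least one of the changed files.
function affectedTests(changedFiles) {
  return Object.keys(graph)
    .filter((file) => file.endsWith('.test.js'))
    .filter((testFile) => changedFiles.some((changed) => dependsOn(testFile, changed)))
    .sort()
}

console.log(affectedTests(['src/multiply.js']))
// → [ 'test/exponent.test.js', 'test/multiply.test.js' ]
```

Maintaining such a graph by hand does not scale, which is exactly the part Haetae automates.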
Installation
So, let's install Haetae. (Node 16 or higher is required.)
It doesn't matter whether your project is new or existing (Haetae can be adopted incrementally).
It works well for monorepos too (covered in another part of the docs).
Literally any project is suitable.
npm install --save-dev haetae
Are you developing a library (e.g. plugin) for Haetae?
You can depend on @haetae/core, @haetae/utils, @haetae/git, @haetae/javascript, and @haetae/cli independently. Note that the package haetae includes all of them.
Basic configuration
Now, we are ready to configure Haetae.
Let's create a config file haetae.config.js.
my-calculator
├── haetae.config.js # <--- Haetae config file
├── package.json
├── src # contents are omitted for brevity
└── test # contents are omitted for brevity
TypeScript Support
If you want to write the config in TypeScript, name it haetae.config.ts.
Then install ts-node, which is listed in peerDependencies.
You need ts-node whether or not you use it directly.
The peer dependency is marked as optional, which means non-TypeScript users don't have to install it.
CJS/ESM
Haetae supports both CJS and ESM projects.
Haetae is written in ESM, but it can be used in CJS projects as well, as long as the config file is ESM.
If your project is CJS, name the config file haetae.config.mjs or haetae.config.mts.
If your project is ESM, name the config file haetae.config.js or haetae.config.ts.
We can write the config like this.
Make sure you have initialized git. Haetae can be used with other version control systems, but this article assumes git.
import { $, core, git, utils, js } from 'haetae'
export default core.configure({
  // Other options are omitted for brevity.
  commands: {
    myTest: {
      run: async () => {
        // An array of changed files
        const changedFiles = await git.changedFiles()
        // An array of test files that (transitively) depend on changed files
        const affectedTestFiles = await js.dependOn({
          dependents: ['**/*.test.js'], // glob pattern
          dependencies: changedFiles,
        })
        if (affectedTestFiles.length > 0) {
          // Equals to "pnpm jest /path/to/foo.test.js /path/to/bar.test.js ..."
          // Change 'pnpm' and 'jest' to your package manager and test runner.
          await $`pnpm jest ${affectedTestFiles}`
        }
      },
    },
  },
})
Multiple APIs are used in the config file above.
They all have various options (check out the API docs).
But we are going to use their sensible defaults for now.
The tagged template literal $ used in the config above can run arbitrary shell commands.
If a placeholder (${...}) is an array, its elements are automatically joined with a space (' ').
It has other traits and options as well. Check out the API docs for more detail.
import { $, utils } from 'haetae'
// The following three calls have the same effect
await $`pnpm jest ${affectedTestFiles}`
await $`pnpm jest ${affectedTestFiles.join(' ')}`
// $ is a wrapper of utils.exec.
// Use utils.exec if you need a function.
// utils.exec may be easier to pass non-default options
await utils.exec(`pnpm jest ${affectedTestFiles.join(' ')}`)
In the above config, pnpm jest is used in $.
Just change them to your package manager and test runner.
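If tagged templates are new to you, the sketch below shows how a tag function can turn an array placeholder into space-separated arguments. The function render is a hypothetical helper written only for this illustration. It builds the command string and does not execute anything, so it is not Haetae's actual implementation of $.

```javascript
// Hypothetical tag function: builds the command string a tag like `$` could produce.
function render(strings, ...values) {
  return strings.reduce((command, chunk, i) => {
    // The last string chunk has no placeholder after it.
    if (i >= values.length) return command + chunk
    const value = values[i]
    // Arrays are joined with a single space; everything else is stringified.
    const text = Array.isArray(value) ? value.join(' ') : String(value)
    return command + chunk + text
  }, '')
}

const files = ['test/add.test.js', 'test/multiply.test.js']
const command = render`pnpm jest ${files}`
console.log(command)
// → pnpm jest test/add.test.js test/multiply.test.js
```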
Credit to google/zx
$ as a tagged template literal is inspired by google/zx. Thanks!
Then run haetae like below.
$ haetae myTest
(If you did not install haetae globally, you should execute it through your package manager, e.g. pnpm haetae myTest.)
Note that myTest in the command above is the name of the command we defined in the config file.
You can name it whatever you want. And as you might guess, you can define multiple commands (e.g. myLint, myBuild, myIntegrationTest, etc.) in the config file.
It will print the result like this.
✔ success Command myTest is successfully executed.
⎡ 🕗 time: 2023 May 28 11:06:06 (timestamp: 1685239566483)
⎜ 🌱 env: {}
⎜ 💾 data:
⎜ "@haetae/git":
⎜ commit: 979f3c6bcafe9f0b81611139823382d615f415fd
⎜ branch: main
⎣ pkgVersion: 0.0.12
As this is the first time the command haetae myTest is run, git.changedFiles() in the config returns every file tracked by git in your project as a changed file (there are options; check out the API docs after reading this article).
This behavior results in running all of the tests.
js.dependOn() understands direct and transitive dependencies between files by parsing import, require(), and so on.
So it can be used to detect which test files (transitively) depend on at least one of the changed files.
js.dependOn can detect multiple formats
ES6 (.js, .mjs), CJS (.js, .cjs), AMD, TypeScript (.ts, .mts, .cts), JSX (.jsx, .tsx), Webpack Loaders, CSS Preprocessors (Sass, Scss, Stylus, Less), PostCSS, and RequireJS are all supported.
For Node.js, Subpath Imports and Subpath Exports are supported.
For TypeScript, Path Mapping is also supported.
Check out the API docs and pass additional option(s) if you use TypeScript or Webpack.
js.dependOn vs js.dependsOn vs utils.dependOn vs utils.dependsOn
There are several APIs with similar purposes.
- js.dependOn: For multiple dependents. For the JS ecosystem.
- js.dependsOn: For a single dependent. For the JS ecosystem.
- utils.dependOn: For multiple dependents. General-purpose.
- utils.dependsOn: For a single dependent. General-purpose.
Check out the API docs later for more detail.
Note that js.dependOn cannot parse dynamic imports.
Dynamic or extra dependencies can be specified with the additionalGraph option, explained later in this article.
As you may have noticed, the store file .haetae/store.json has been generated.
It stores the history of Haetae executions, which makes incremental tasks possible.
For example, the commit ID 979f3c6 printed in the output above is the git HEAD that haetae myTest ran on.
This information is logged in the store file to be used later.
my-calculator
├── .haetae/store.json # <--- Generated
├── haetae.config.js
├── package.json
├── src
└── test
Detecting the last commit Haetae ran on successfully
Let's say we made some changes and added 2 commits.
979f3c6 is the last commit Haetae ran on successfully.
What will happen when we run Haetae again?
$ haetae myTest
This time, only exponent.test.js and multiply.test.js are executed.
That's because git.changedFiles() automatically returns only the files changed since the last successful execution of Haetae.
For another example, if you modify add.js, then all tests will be executed, because js.dependOn() detects dependencies transitively.
If you modify add.test.js, only the test file itself (add.test.js) will be executed, as every file is treated as depending on itself.
✔ success Command myTest is successfully executed.
⎡ 🕗 time: 2023 May 28 19:03:25 (timestamp: 1685268205443)
⎜ 🌱 env: {}
⎜ 💾 data:
⎜ "@haetae/git":
⎜ commit: 1d17a2f2d75e2ac94f31e53376c549751dca85fb
⎜ branch: main
⎣ pkgVersion: 0.0.12
Accordingly, the new commit 1d17a2f is logged in the store file.
The output above is an example of a successful task.
Conversely, if the test fails, pnpm jest <...>, which we gave to $ in the config, exits with a non-zero exit code.
This makes $ throw an error.
So myTest.run() does not complete successfully, and the store file is not renewed.
This behavior is useful for incremental tasks. The failed test (or any incremental task) will be re-executed until the problem is fixed.
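The "record only on success" rule can be sketched in a few lines of plain JavaScript. This is only an illustration of the idea, not Haetae's code: runAndRecord, demo, and the in-memory store object are hypothetical stand-ins for the real bookkeeping and the store file.

```javascript
// Hypothetical stand-in for the bookkeeping: records `head` only when `task` succeeds.
async function runAndRecord(store, task, head) {
  await task() // if this rejects, the line below never runs
  store.lastCommit = head
}

async function demo() {
  const store = { lastCommit: '979f3c6' } // in-memory stand-in for the store file

  // A failing task, like a test runner exiting with a non-zero code.
  const failingTask = async () => {
    throw new Error('tests failed')
  }
  await runAndRecord(store, failingTask, '1d17a2f').catch(() => {})
  const afterFailure = store.lastCommit

  // A passing task lets the record move forward.
  await runAndRecord(store, async () => {}, '1d17a2f')
  const afterSuccess = store.lastCommit

  return { afterFailure, afterSuccess }
}
```

Calling demo() resolves to { afterFailure: '979f3c6', afterSuccess: '1d17a2f' }: after the failure the old commit is still the reference point, so the next run compares against it again.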
env configuration
Sometimes we need to separate several environments.
Simple environment variable example
For example, the logic of your project might behave differently depending on the environment variable $NODE_ENV.
So the history of an incremental task should also be recorded separately for each environment.
Let's add env to the config file to achieve this.
import { $, core, git, utils, js } from 'haetae'
export default core.configure({
  commands: {
    myTest: {
      env: { // <--- Add this
        NODE_ENV: process.env.NODE_ENV,
      },
      run: async () => { /* ... */ },
    },
  },
})
The key name NODE_ENV is just an example. You can name it as you want.
From now on, the store file will manage the history of each environment separately.
For example, if $NODE_ENV can have two values, 'development' or 'production', then Haetae will manage two incremental histories, one for each environment.
You don't have to care about the past history of myTest executed without env.
When a command is configured without env, it's treated as if configured with env: {}, which is totally fine.
So there will be 3 envs to be recorded in the store file:
{}
{ NODE_ENV: 'production' }
{ NODE_ENV: 'development' }
Though we changed the schema of env in the config from {} to { NODE_ENV: 'development' | 'production' }, the history of env: {} already recorded in the store file is NOT automatically deleted.
It just stays in the store file.
This behavior is completely safe, so don't worry about leftovers from the past.
If you care about disk space, configuring the auto-removal of obsolete history is covered later in this article.
Multiple keys
You can add more keys to the env object.
For instance, let's change the config to this.
import assert from 'node:assert/strict' // `node:` protocol is optional
import { $, core, git, utils, js, pkg } from 'haetae'
import semver from 'semver'
export default core.configure({
  commands: {
    myTest: {
      env: async () => { // <--- Changed to async function from object
        assert(['development', 'production'].includes(process.env.NODE_ENV))
        return {
          NODE_ENV: process.env.NODE_ENV,
          jestConfig: await utils.hash(['jest.config.js']),
          jest: (await js.version('jest')).major,
          branch: await git.branch(),
          os: process.platform,
          node: semver.major(process.version),
          haetae: pkg.version.major,
        }
      },
      run: async () => { /* ... */ },
    },
  },
})
The object has more keys than before, named jestConfig, jest, branch, and so on.
If any of $NODE_ENV, the Jest config file, the major version of Jest, the git branch, the OS platform, the major version of Node.js, or the major version of the package haetae is changed, it's treated as a different environment.
And now env becomes a function. You can even freely write any additional code in it, like the assertion (assert()) in the config above. myTest.env() is executed before myTest.run().
Just like myTest.run(), when an error is thrown in myTest.env(), the store file is not renewed, which is the intended design for incremental tasks.
If you just want to check the value the env function returns, you can use the -e, --env option.
This does not write to the store file, but just prints the value.
$ haetae myTest --env
✔ success Current environment is successfully executed for the command myTest
⎡ NODE_ENV: development
⎜ jestConfig: 642645d6bc72ab14a26eeae881a0fc58e0fb4a25af31e55aa9b0d134160436eb
⎜ jest: 29
⎜ branch: main
⎜ os: darwin
⎜ node: 18
⎣ haetae: 0
Additional dependency graph
Until now, js.dependOn() has been used for automatic detection of the dependency graph.
But sometimes, you need to specify some dependencies manually.
Simple integration test
For example, let's say you're developing a project communicating with a database.
your-project
├── haetae.config.js
├── package.json
├── src
│   ├── external.js
│   ├── logic.js
│   └── index.js
└── test
    ├── data.sql
    ├── external.test.js
    ├── logic.test.js
    └── index.test.js
The explicit dependency graph is like this.
logic.js contains business logic, including communicating with a database.
external.js communicates with a certain external service, regardless of the database.
But there is a SQL file named data.sql for an integration test.
It's not (and can't be) imported (e.g. by import or require()) by any source code file.
Let's tell Haetae that logic.js depends on data.sql, using additionalGraph.
import { $, core, git, utils, js } from 'haetae'
export default core.configure({
  commands: {
    myTest: {
      env: { /* ... */ },
      run: async () => {
        const changedFiles = await git.changedFiles()
        // A graph of additional dependencies specified manually
        const additionalGraph = await utils.graph({
          edges: [
            {
              dependents: ['src/logic.js'],
              dependencies: ['test/data.sql'],
            },
          ],
        })
        const affectedTestFiles = await js.dependOn({
          dependents: ['**/*.test.js'],
          dependencies: changedFiles,
          additionalGraph, // <--- New option
        })
        if (affectedTestFiles.length > 0) {
          await $`pnpm jest ${affectedTestFiles}`
        }
      },
    },
  },
})
Then the implicit dependency graph becomes explicit.
From now on, when the file data.sql is changed, index.test.js and logic.test.js are executed.
As external.test.js doesn't transitively depend on data.sql, it's not executed.
Unlike this general and natural flow, if you decide that index.test.js should never be affected by data.sql, you can change the config.
// Other content is omitted for brevity
const additionalGraph = await utils.graph({
  edges: [
    {
      dependents: ['test/logic.test.js'], // 'src/logic.js' to 'test/logic.test.js'
      dependencies: ['test/data.sql'],
    },
  ],
})
With this, data.sql no longer affects index.test.js.
But I recommend this practice only when you're sure that index.test.js will never be related to data.sql.
Otherwise, you would have to update the config again whenever the relation changes.
env vs additionalGraph
The effect of additionalGraph is different from env.
env is like defining parallel universes, where history is recorded separately.
If you place data.sql in env (e.g. with utils.hash()) instead of additionalGraph, every test file will be executed when data.sql changes, unless the change is a rollback to past content that matches a past value of env logged in the store file (.haetae/store.json).
external.js and external.test.js are unrelated to the database.
That's why data.sql is applied as additionalGraph, not as env.
But that depends on the case. In many situations, env is beneficial.
- If data.sql affects 'most' of your integration test files, or
- If it is not clear which test files do and don't depend on data.sql, or the relations change frequently, or
- If data.sql is not frequently changed,
then env is a good place.
import { $, core, git, utils, js } from 'haetae'
export default core.configure({
  commands: {
    myTest: {
      env: async () => ({
        testData: await utils.hash(['test/data.sql']),
      }),
      run: async () => { /* ... */ }, // without `additionalGraph`
    },
  },
})
Cartesian product
You can specify the dependency graph from a chunk of files to another chunk.
// Other content is omitted for brevity
const additionalGraph = await utils.graph({
  edges: [
    {
      dependents: ['test/db/*.test.js'],
      dependencies: [
        'test/docker-compose.yml',
        'test/db/*.sql',
      ],
    },
  ],
})
This means that any test file under test/db/ depends on any SQL file under test/db/ and on test/docker-compose.yml.
Distributed notation
You don't have to specify a dependent's dependencies all at once. It can be done in a distributed manner.
// Other content is omitted for brevity
const additionalGraph = await utils.graph({
  edges: [
    {
      dependents: ['foo', 'bar'],
      dependencies: ['one', 'two'],
    },
    {
      dependents: ['foo', 'qux'], // 'foo' appears again, and it's fine
      dependencies: ['two', 'three', 'bar'], // 'two' and 'bar' appear again, and it's fine
    },
    {
      dependents: ['one', 'two', 'three'],
      dependencies: ['two'], // 'two' depends on itself, and it's fine
    },
    {
      dependents: ['foo'],
      dependencies: ['one'], // 'foo' -> 'one' appears again, and it's fine
    },
  ],
})
In the third edge above, we marked two as depending on two itself.
That's OK, as every file is treated as depending on itself.
So foo depends on foo, bar also depends on bar, and so on.
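To see why repeated and self-referencing edges are harmless, here is a minimal sketch of how such edges could be merged. It is an assumption-laden illustration, not the implementation of utils.graph: buildGraph is a hypothetical helper, it treats every string as a literal name (no glob expansion), and it stores each node's direct dependencies in a Set so duplicates collapse naturally.

```javascript
// Hypothetical helper: expands edges into a Map of node -> Set of direct dependencies.
function buildGraph(edges) {
  const graph = new Map()
  // Every node starts out depending on itself.
  const ensure = (node) => {
    if (!graph.has(node)) graph.set(node, new Set([node]))
    return graph.get(node)
  }
  for (const { dependents, dependencies } of edges) {
    // Cartesian product: each dependent depends on each dependency.
    for (const dependent of dependents) {
      for (const dependency of dependencies) {
        ensure(dependency)
        ensure(dependent).add(dependency) // a Set ignores repeated edges
      }
    }
  }
  return graph
}

const graph = buildGraph([
  { dependents: ['foo', 'bar'], dependencies: ['one', 'two'] },
  { dependents: ['foo', 'qux'], dependencies: ['two', 'three', 'bar'] },
  { dependents: ['one', 'two', 'three'], dependencies: ['two'] },
  { dependents: ['foo'], dependencies: ['one'] },
])

console.log([...graph.get('foo')].sort())
// → [ 'bar', 'foo', 'one', 'three', 'two' ]
```

Because the result is a set per node, declaring foo -> one twice or two -> two explicitly changes nothing.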
Circular dependency
Haetae supports circular dependency as well. Although circular dependency is, in general, considered not a good practice, it's fully up to you to decide whether to define it. Haetae does not prevent you from defining it.
// Other content is omitted for brevity
const additionalGraph = await utils.graph({
  edges: [
    {
      dependents: ['index.js'],
      dependencies: ['foo'],
    },
    {
      dependents: ['foo'],
      dependencies: ['bar'],
    },
    {
      dependents: ['bar'],
      dependencies: ['index.js'],
    },
  ],
})
Assume the relations between index.js, foo, and bar are given by additionalGraph, and the rest are automatically detected.
In this situation, index.test.js is executed when any file other than utils.test.js is changed, including foo and bar.
On the other hand, utils.test.js is executed only when utils.js or utils.test.js itself is changed.
More APIs not covered
There are more APIs related to the dependency graph, like js.graph, js.deps, utils.deps, utils.mergeGraph, etc.
This article doesn't cover them all. Check out the API docs for more detail.
Record Data
Haetae has a concept of 'Record' (type: core.HaetaeRecord) and 'Record Data' (type: core.HaetaeRecord.data).
In the previous sections, we've already seen terminal outputs like this.
$ haetae myTest
✔ success Command myTest is successfully executed.
⎡ 🕗 time: 2023 May 28 11:06:06 (timestamp: 1685239566483)
⎜ 🌱 env: {}
⎜ 💾 data:
⎜ "@haetae/git":
⎜ commit: 979f3c6bcafe9f0b81611139823382d615f415fd
⎜ branch: main
⎣ pkgVersion: 0.0.12
This information is logged in the store file (.haetae/store.json), and is called a 'Record'.
The data field is called 'Record Data'.
Let's check them out.
$ cat .haetae/store.json
The output is like this.
{
"version": "0.0.14",
"commands": {
"myTest": [
{
"data": {
"@haetae/git": {
"commit": "1d17a2f2d75e2ac94f31e53376c549751dca85fb",
"branch": "main",
"pkgVersion": "0.0.12"
}
},
"env": {},
"envHash": "bf21a9e8fbc5a3846fb05b4fa0859e0917b2202f",
"time": 1685239566483
},
{
"data": {
"@haetae/git": {
"commit": "a4f4e7e83eedbf2269fbf29d91f08289bdeece91",
"branch": "main",
"pkgVersion": "0.0.12"
}
},
"env": {
"NODE_ENV": "production"
},
"envHash": "4ed28f8415aeb22c021e588c70d821cb604c7ae0",
"time": 1685458529856
},
{
"data": {
"@haetae/git": {
"commit": "442fefc582889bdaee5ec2bd8b74804680fc30ee",
"branch": "main",
"pkgVersion": "0.0.12"
}
},
"env": {
"NODE_ENV": "development"
},
"envHash": "2b580e42012efb489cdea43194c9dd6aed6b77d8",
"time": 1685452061199
},
{
"data": {
"@haetae/git": {
"commit": "ef3fdf88e9fad90396080335096a88633fbe893f",
"branch": "main",
"pkgVersion": "0.0.12"
}
},
"env": {
"jestConfig": "642645d6bc72ab14a26eeae881a0fc58e0fb4a25af31e55aa9b0d134160436eb",
"jest": 29,
"branch": "main",
"os": "darwin",
"node": 18,
"haetae": 0
},
"envHash": "62517924fb2c6adb38b4f30ba75a513066f5ac80",
"time": 1685455507556
},
{
"data": {
"@haetae/git": {
"commit": "7e3b332f0657272cb277c312ff25d4e1145f895c",
"branch": "main",
"pkgVersion": "0.0.12"
}
},
"env": {
"testData": "b87b8be8df58976ee7da391635a7f45d8dc808357ff63fdcda699df937910227"
},
"envHash": "7ea1923c8bad940a97e1347ab85abd4811e82531",
"time": 1685451151035
}
]
}
}
Env Hash
The field envHash is the SHA-1 hash of the env object.
The env object is serialized by a deterministic method, no matter how deep it is, and then hashed.
The hash is used to match the current env with previous records.
SHA-1 is considered insecure, but it is good enough to prevent collisions for history comparison.
For example, git also uses SHA-1 for commit IDs.
If your Env or Record Data contains a confidential field and you're worried about the store being leaked, you can preprocess secret fields with a stronger cryptographic hash algorithm, like SHA-256 or SHA-512.
A practical guide with utils.hash() is given in the next section.
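The idea of hashing a deterministically serialized object can be sketched as follows. The exact serialization Haetae uses is not described here, so stableStringify below is a hypothetical serializer (it simply sorts object keys at every depth), and the hashes it produces are not expected to match the envHash values in the store file above. CommonJS require is used only so the snippet runs as a standalone script.

```javascript
const { createHash } = require('node:crypto')

// Hypothetical deterministic serializer: object keys are sorted at every depth.
function stableStringify(value) {
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(',')}]`
  if (value !== null && typeof value === 'object') {
    const entries = Object.keys(value)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(value[key])}`)
    return `{${entries.join(',')}}`
  }
  return JSON.stringify(value)
}

// SHA-1 of the serialized env, as a 40-character hex string.
function envHash(env) {
  return createHash('sha1').update(stableStringify(env)).digest('hex')
}

// Key order does not matter, so these two envs get the same hash.
const a = envHash({ NODE_ENV: 'production', os: 'darwin', nested: { x: 1, y: 2 } })
const b = envHash({ nested: { y: 2, x: 1 }, os: 'darwin', NODE_ENV: 'production' })
console.log(a === b) // → true
```

A different value anywhere in the object, however deep, yields a different hash, which is what lets each env keep its own history.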
recordRemoval.leaveOnlyLastestPerEnv of localStore
By default, you're using localStore as a 'Store Connector'.
localStore stores records into a file (.haetae/store.json).
The option recordRemoval.leaveOnlyLastestPerEnv is true by default.
So only the latest record per env exists in the store file.
This is useful when you only depend on the latest Records.
To utilize older Records, you can set the option to false.
Changing or configuring the 'Store Connector' is covered later.
5 Records are found in total.
These are what we've done in this article so far.
Each of these is the latest Record executed in its respective env.
For example, the command myTest was executed with env: {} on several commits, and 1d17a2f is the last commit.
Custom Record Data
Configuration files for your application are a good example showing the usefulness of Record Data.
I mean a config file not for Haetae, but for your project itself, such as dotenv (.env), .yaml, .properties, .json, etc.
Usually, an application config file satisfies these 2 conditions.
- It's not explicitly imported (e.g. import, require()) in the source code. Rather, the source code 'reads' it at runtime. ---> additionalGraph or env is useful.
- It's ignored by git. ---> 'Record Data' is useful.
Let's see how it works, with a simple example project using .env as the application config.
dotenv
.env is a configuration file for environment variables, and NOT related to Haetae's env at all.
your-project
├── .env # <--- dotenv file
├── .gitignore # <--- ignores '.env' file
├── haetae.config.js
├── package.json
├── src
│   ├── config.js
│   ├── utils.js
│   ├── logic.js
│   └── index.js
└── test
    ├── utils.test.js
    ├── logic.test.js
    └── index.test.js
src/config.js reads the file .env, using the dotenv library, for example.
import { config } from 'dotenv'
config()
export default {
  port: process.env.PORT,
  secretKey: process.env.SECRET_KEY,
}
Let's assume logic.js gets the value of environment variables through config.js, not by reading directly from .env or process.env.
The explicit source code dependency graph is like this.
Let's tell Haetae that config.js depends on .env.
import { $, core, git, utils, js } from 'haetae'
export default core.configure({
  commands: {
    myTest: {
      env: { /* ... */ },
      run: async () => {
        const changedFiles = await git.changedFiles()
        const additionalGraph = await utils.graph({
          edges: [
            {
              dependents: ['src/config.js'],
              dependencies: ['.env'],
            },
          ],
        })
        const affectedTestFiles = await js.dependOn({
          dependents: ['**/*.test.js'],
          dependencies: changedFiles,
          additionalGraph,
        })
        if (affectedTestFiles.length > 0) {
          await $`pnpm jest ${affectedTestFiles}`
        }
      },
    },
  },
})
Then the implicit dependency graph becomes explicit.
But that's not enough, because .env is ignored by git.
git.changedFiles() cannot detect whether .env changed or not.
Let's use 'Record Data' to solve this problem. Add these into the config file like this.
import { $, core, git, utils, js } from 'haetae'
export default core.configure({
  commands: {
    myTest: {
      env: { /* ... */ },
      run: async () => {
        const changedFiles = await git.changedFiles()
        const previousRecord = await core.getRecord()
        const dotenvHash = await utils.hash(['.env'])
        if (previousRecord?.data?.dotenv !== dotenvHash) {
          changedFiles.push('.env')
        }
        const additionalGraph = await utils.graph({
          edges: [
            {
              dependents: ['src/config.js'],
              dependencies: ['.env'],
            },
          ],
        })
        const affectedTestFiles = await js.dependOn({
          dependents: ['**/*.test.js'],
          dependencies: changedFiles,
          additionalGraph,
        })
        if (affectedTestFiles.length > 0) {
          await $`pnpm jest ${affectedTestFiles}`
        }
        return {
          dotenv: dotenvHash
        }
      },
    },
  },
})
Now, we return an object from myTest.run.
Let's execute it.
$ haetae myTest
✔ success Command myTest is successfully executed.
⎡ 🕗 time: 2023 Jun 08 09:23:07 (timestamp: 1686183787453)
⎜ 🌱 env: {}
⎜ 💾 data:
⎜ "@haetae/git":
⎜ commit: ac127da6531efa487b8ee35451f24a70dc58aeea
⎜ branch: main
⎜ pkgVersion: 0.0.12
⎣ dotenv: 7f39224e335994886c26ba8c241fcbe1d474aadaa2bd0a8e842983b098cea894
Do you see the last line?
The value we returned from myTest.run is recorded in the store file, as part of Record Data.
Hash confidential data
utils.hash() is good for secrets like a dotenv file.
By default, it hashes with SHA-256, and you can change the cryptographic hash algorithm through its options, to SHA-512 for example.
Thus, you do not need to worry about the store file being leaked.
This time, .env was treated as a changed file, as the key dotenv did not exist in previousRecord.
// Other content is omitted for brevity
if (previousRecord?.data?.dotenv !== dotenvHash) {
  changedFiles.push('.env')
}
Therefore, index.test.js and logic.test.js, which transitively depend on .env, are executed.
If you run Haetae again immediately,
$ haetae myTest
This time, no test is executed, as nothing is considered changed. .env is treated as not changed, thanks to the Record Data.
From now on, though the file .env is ignored by git, changes to it are recorded by custom Record Data.
So it can be used in incremental tasks.
Reserved Record Data
We can enhance the workflow further.
import { $, core, git, utils, js } from 'haetae'
export default core.configure({
  commands: {
    myTest: {
      env: { /* ... */ },
      run: async () => {
        const changedFiles = await git.changedFiles()
        const changedFilesByHash = await utils.changedFiles(['.env'])
        changedFiles.push(...changedFilesByHash)
        const additionalGraph = await utils.graph({
          edges: [
            {
              dependents: ['src/config.js'],
              dependencies: ['.env'],
            },
          ],
        })
        const affectedTestFiles = await js.dependOn({
          dependents: ['**/*.test.js'],
          dependencies: changedFiles,
          additionalGraph,
        })
        if (affectedTestFiles.length > 0) {
          await $`pnpm jest ${affectedTestFiles}`
        }
        // No return value
      },
    },
  },
})
We return nothing here.
We do not calculate the hash ourselves.
But this has the same effect as what we've done in the previous section.
$ haetae myTest
✔ success Command myTest is successfully executed.
⎡ 🕗 time: 2023 Jun 11 00:27:40 (timestamp: 1686410860187)
⎜ 🌱 env: {}
⎜ 💾 data:
⎜ "@haetae/git":
⎜ commit: 018dd7e0c65c3a9d405485df7949ef75ff96e757
⎜ branch: main
⎜ pkgVersion: 0.0.13
⎜ "@haetae/utils":
⎜ files:
⎜ .env: 7f39224e335994886c26ba8c241fcbe1d474aadaa2bd0a8e842983b098cea894
⎣ pkgVersion: 0.0.14
You can see the hash of .env is recorded.
utils.changedFiles automatically writes the hash in Record Data, and compares the current hash with the previous one.
How is this possible?
There's a concept of Reserved Record Data.
If you call core.reserveRecordData, you can 'reserve' Record Data without directly returning custom Record Data from the command's run function.
git.changedFiles and utils.changedFiles call core.reserveRecordData internally.
This mechanism can be especially useful for sharable generic features, like a 3rd-party library for Haetae.
For that, it's important to avoid naming collisions.
Record Data can have arbitrary fields.
So Haetae uses a package name as a namespace by convention.
The '@haetae/git' and '@haetae/utils' keys in Record Data are namespaces to avoid such collisions.
utils.changedFiles is more useful for multiple files.
Let's say you have multiple dotenv files per environment, unlike the previous assumption.
For example, .env.local, .env.development, and .env.staging are targets to test.
Now, config.js reads .env.${process.env.ENV}, where $ENV is an indicator of the environment: 'local', 'development', or 'staging'.
Then we can modify the config file like this.
import { $, core, git, utils, js } from 'haetae'
export default core.configure({
  commands: {
    myTest: {
      env: { /* ... */ },
      run: async () => {
        const changedFiles = await git.changedFiles()
        const changedFilesByHash = await utils.changedFiles(
          ['.env.*'], // or ['.env.{local,development,staging}']
          {
            renew: [`.env.${process.env.ENV}`],
          },
        )
        changedFiles.push(...changedFilesByHash)
        const additionalGraph = await utils.graph({
          edges: [
            {
              dependents: ['src/config.js'],
              dependencies: [`.env.${process.env.ENV}`],
            },
          ],
        })
        const affectedTestFiles = await js.dependOn({
          dependents: ['**/*.test.js'],
          dependencies: changedFiles,
          additionalGraph,
        })
        if (affectedTestFiles.length > 0) {
          await $`pnpm jest ${affectedTestFiles}`
        }
      },
    },
  },
})
renew is a list of files (or glob patterns) whose recorded hashes will be renewed (if changed) with their current hashes.
By default, renew is equal to all files (['.env.*']) we gave as the argument.
In our config, by limiting it to .env.${process.env.ENV}, you only renew the single dotenv file.
Let's say $ENV is currently 'local'.
.env.local, .env.development, and .env.staging are all compared to their previous hashes.
If changes are detected, the changed files are included in the result array.
But regardless of that, .env.development and .env.staging are not renewed in the new Record Data.
Their previous hashes are recorded instead of their current hashes.
This behavior can be good for our test in many scenarios.
For instance, .env.development can be modified while $ENV is 'local'.
As it's not in the renew list, the hash of .env.development is not updated.
When $ENV later becomes 'development', utils.changedFiles would still think .env.development is a changed file, as the current hash and the previously recorded hash are different.
This makes sure the test files are re-executed when $ENV becomes 'development'.
renew exists to handle the discrepancy between when the physical change actually happens and when the detection of the change is needed.
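The scenario above can be modeled in plain JavaScript. This is only a simplified model of the behavior as described in this article, not the real utils.changedFiles: changedFilesByHash is a hypothetical function, hashes are short made-up strings, and details such as globs or files missing from the previous record are ignored.

```javascript
// Hypothetical model: `previous` and `current` map file names to content hashes.
function changedFilesByHash(previous, current, renew) {
  // Every listed file is compared, whether or not it is in `renew`.
  const changed = Object.keys(current).filter((file) => previous[file] !== current[file])
  const nextRecord = {}
  for (const file of Object.keys(current)) {
    // Only files in `renew` get their current hash recorded; the others keep the old one.
    nextRecord[file] = renew.includes(file) ? current[file] : previous[file]
  }
  return { changed, nextRecord }
}

const previous = { '.env.local': 'l1', '.env.development': 'd1', '.env.staging': 's1' }
// .env.local and .env.development were both edited while $ENV is 'local'.
const current = { '.env.local': 'l2', '.env.development': 'd2', '.env.staging': 's1' }

const localRun = changedFilesByHash(previous, current, ['.env.local'])
console.log(localRun.changed) // → [ '.env.local', '.env.development' ]

// Later, $ENV becomes 'development' and nothing else was edited in between.
const developmentRun = changedFilesByHash(localRun.nextRecord, current, ['.env.development'])
console.log(developmentRun.changed) // → [ '.env.development' ]
```

In this model, the second run still reports .env.development because its old hash 'd1' was carried over by the first run instead of being renewed.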
utils.changedFiles has many more options, and behaves in a somewhat complicated way.
For example, with the option keepRemovedFiles, which is not introduced above, you can handle cases where not all of the files exist on the filesystem at the same time and only a few of them are dynamically used in incremental tasks.
For instance, a CI workflow might have access to only .env.development at a certain time, while it might have access to only .env.staging at another time.
And you may still want the incremental history not separated but shared between the two cases.
That's where keepRemovedFiles comes in.
utils.changedFiles is not covered thoroughly here.
Check out the API docs for more detail.
There's one more thing to be careful about with utils.changedFiles.
You should NOT give a dynamic files argument to it.
Otherwise, a file would be treated as changed every time the dynamic argument changes.
// Other content is omitted for brevity
const changedFilesByHash = await utils.changedFiles(
  [`.env.${process.env.ENV}`] // <--- Anti-pattern
)
The snippet above lets only a single file be recorded.
So, if $ENV is changed, the previous file is no longer recorded.
This has no safety problem, but reduces incrementality.
Therefore you should list all of the candidates, like ['.env.*'].
Root Env and Root Record Data
Haetae has a concept of 'Root Env' (type: core.RootEnv) and 'Root Record Data' (type: core.RootRecordData).
They are decorator-like transformers for the return values of env and run of every command.
import { $, core, git, utils, js } from 'haetae'
export default core.configure({
  recordData: async (data) => ({ // <--- 'Root Record Data'
    hello: data.hello.toUpperCase(),
  }),
  commands: {
    myGreeting: {
      run: () => ({ hello: 'world' }),
    },
  },
})
$ haetae myGreeting
✔ success Command myGreeting is successfully executed.
⎡ 🕗 time: 2023 Jun 14 15:49:52 (timestamp: 1686725392672)
⎜ 🌱 env: {}
⎜ 💾 data:
⎣ hello: WORLD # <--- capitalized
Let's get into a more practical example.
You may want the config file's hash to be automatically recorded into every command's env.
import * as url from 'node:url'
import { $, core, git, utils, js } from 'haetae'
export default core.configure({
  env: async (env) => ({ // <--- 'Root Env'
    ...env,
    // Equals to => await utils.hash(['haetae.config.js']),
    haetaeConfig: await utils.hash([url.fileURLToPath(import.meta.url)]),
  }),
  commands: {
    myGreeting: {
      env: {
        NODE_ENV: process.env.NODE_ENV
      },
      run: () => { /* ... */ }
    },
  },
})
With Root Env, it's done in a single place.
$ haetae myGreeting --env
✔ success Current environment is successfully executed for the command myGreeting
⎡ NODE_ENV: development
⎣ haetaeConfig: f7c12d5131846a5db496b87cda59d3e07766ed1bde8ed159538e85f42f3a8dae
By the way, you can be even more thorough.
js.deps lists every direct and transitive dependency.
// Other content is omitted for brevity
haetaeConfig: await utils.hash(
  await js.deps({ entrypoint: url.fileURLToPath(import.meta.url) }),
),
This snippet calculates a hash of the config file and its dependencies. If you don't import other modules in the config, this is not necessary.