The crew package has unavoidable risks, and the user is responsible for
safety, security, and computational resources. This vignette describes
known risks and safeguards, but is by no means exhaustive. Please read the
software license.
The crew package launches external R processes, including a mirai dispatcher process to schedule the tasks. If x is a crew controller, the process ID of the dispatcher is `x$client$dispatcher`.

In the event of a poorly timed crash or network error, these processes may not terminate properly. If that happens, they will continue to run, which may strain traditional clusters or incur heavy expenses on the cloud. Please monitor the platforms you use and manually terminate defunct or hanging processes as needed.
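As a minimal sketch of defensive cleanup, the hypothetical helper below (`with_controller()` is not part of crew) starts a local controller and registers `terminate()` with `on.exit()`, so the external processes are shut down even if the user code fails. It assumes crew is installed and that `x$client$dispatcher` holds the dispatcher process ID as described above, which may vary across crew versions.

```r
# Hypothetical helper, not part of crew: run `code(x)` with a started
# controller and always attempt to shut the controller down afterwards.
with_controller <- function(code, workers = 2L) {
  x <- crew::crew_controller_local(workers = workers)
  x$start()
  # Runs when the function exits, whether normally or through an error.
  on.exit(x$terminate(), add = TRUE)
  # Record the dispatcher PID so a leftover process can be found later.
  message("dispatcher PID: ", x$client$dispatcher)
  code(x)
}

# Example usage (requires crew):
# with_controller(function(x) {
#   x$push(command = Sys.getpid(), name = "example")
#   x$wait()
#   x$pop()$result[[1]]
# })
```

If a process still survives a crash, base R's `tools::pskill(pid)` can send it a termination signal from the same machine. Remote workers need the tools of the platform they run on.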
In addition, crew occupies one TCP port per controller. TCP ports range
from 0 to 65535, and only around 16000 of these ports are considered
ephemeral or dynamic, so please be careful not to run too many controllers
simultaneously on shared machines, especially in controller groups. The
`terminate()` method frees these ports again for other processes to use.
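The "around 16000" figure can be checked with a little arithmetic. This sketch assumes the IANA dynamic port range of 49152 through 65535. Operating systems may configure a different ephemeral range, so treat the numbers as illustrative.

```r
# IANA dynamic (ephemeral) port range. Assumption: the local operating
# system may be configured with a different range.
ephemeral_first <- 49152L
ephemeral_last <- 65535L

# Number of ephemeral ports in that range.
n_ephemeral <- ephemeral_last - ephemeral_first + 1L
n_ephemeral # 16384

# Share of all 65536 TCP port numbers (0 through 65535).
share_ephemeral <- n_ephemeral / 65536
share_ephemeral # 0.25
```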
By default, crew uses unencrypted TCP connections for transactions among
workers. In a compromised network, an attacker can read the data in
transit, and even gain direct access to the client or host.
It is best to avoid persistent direct connections between your local
computer and the public internet. The host argument of the controller
should not be a public IP address. Instead, please try to operate entirely
within a perimeter such as a firewall, a virtual private network (VPN), or
an Amazon Web Services (AWS) security group. In the case of AWS, your
security group can open ports to itself. That way, the crew workers on,
for example, AWS Batch jobs can connect to a crew client running in the
same security group on an AWS Batch job or EC2 instance.
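One way to guard against accidentally using a public address is to validate the host before creating the controller. The sketch below is illustrative only: `is_private_ipv4()` and `safe_controller()` are hypothetical helpers, not part of crew. The check covers only the RFC 1918 private IPv4 ranges plus loopback. It does not handle hostnames or IPv6, and a private address does not by itself prove the network is safe.

```r
# Hypothetical helper: TRUE if `host` is a dotted-quad IPv4 address in a
# private (RFC 1918) or loopback range, FALSE otherwise.
is_private_ipv4 <- function(host) {
  parts <- strsplit(host, ".", fixed = TRUE)[[1]]
  if (length(parts) != 4L || !all(grepl("^[0-9]{1,3}$", parts))) {
    return(FALSE)
  }
  octets <- as.integer(parts)
  if (any(octets > 255L)) {
    return(FALSE)
  }
  a <- octets[1]
  b <- octets[2]
  a == 10L ||                              # 10.0.0.0/8
    a == 127L ||                           # 127.0.0.0/8 (loopback)
    (a == 172L && b >= 16L && b <= 31L) || # 172.16.0.0/12
    (a == 192L && b == 168L)               # 192.168.0.0/16
}

# Hypothetical wrapper: refuse to create a controller on a public address.
# Requires crew when called.
safe_controller <- function(host, ...) {
  stopifnot(is_private_ipv4(host))
  crew::crew_controller_local(host = host, ...)
}
```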
In the age of Zero Trust, perimeters alone are seldom sufficient. Transport layer security (TLS) encrypts data to protect it from attackers while it travels over a network. TLS is the state of the art of encryption for network communications, and it is responsible for security in popular protocols such as HTTPS. TLS is based on public key cryptography, which requires two files: a private key file and a public key file.
To use TLS in crew with automatic configuration, simply set
`tls = crew_tls(mode = "automatic")` in the controller, e.g.
`crew_controller_local()`. [1] mirai generates a one-time key pair and
encrypts data for the current crew client. The key pair expires when the
client terminates, which reduces the risk of a breach. In addition, the
public key is a self-signed certificate, which somewhat protects against
tampering on its way from the client to the server.
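The automatic TLS setting described above can be sketched as follows. The wrapper `new_tls_controller()` is a hypothetical convenience function, not part of crew, and the example assumes a crew version whose `crew_controller_local()` accepts the `tls` argument.

```r
# Hypothetical wrapper: create a local controller with automatic TLS.
# Requires crew when called.
new_tls_controller <- function(workers = 1L) {
  crew::crew_controller_local(
    workers = workers,
    # Ask mirai to generate a one-time key pair for this client.
    tls = crew::crew_tls(mode = "automatic")
  )
}

# Example usage (requires crew):
# x <- new_tls_controller()
# x$start()
# x$terminate() # the one-time key pair expires with the client
```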
[1] Launcher plugins should expose the `tls` argument of `crew_client()`.