add support for nvidia gpu access (#5132)

Signed-off-by: Mondo <mondo.jiang@wisc.edu>
Signed-off-by: Jean-Yves <7360784+docjyJ@users.noreply.github.com>
Signed-off-by: Simon L. <szaimen@e.mail.de>
Co-authored-by: Jean-Yves <7360784+docjyJ@users.noreply.github.com>
Co-authored-by: Simon L. <szaimen@e.mail.de>
This commit is contained in:
MondoGao 2024-12-20 04:12:59 -06:00 committed by GitHub
parent 119d03694e
commit 53edc5d4a9
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
17 changed files with 92 additions and 11 deletions

View file

@ -16,6 +16,10 @@
"aio_variables": [
"nextcloud_memory_limit=2048M"
],
"devices": [
"/dev/dri"
],
"enable_nvidia_gpu": true,
"nextcloud_exec_commands": [
"php /var/www/html/occ app:install facerecognition",
"php /var/www/html/occ app:enable facerecognition",

View file

@ -31,6 +31,7 @@
"devices": [
"/dev/dri"
],
"enable_nvidia_gpu": true,
"backup_volumes": [
"nextcloud_aio_jellyfin"
]

View file

@ -29,6 +29,10 @@
"writeable": false
}
],
"devices": [
"/dev/dri"
],
"enable_nvidia_gpu": true,
"nextcloud_exec_commands": [
"mkdir '/mnt/ncdata/admin/files/nextcloud-aio-local-ai'",
"touch '/mnt/ncdata/admin/files/nextcloud-aio-local-ai/models.yaml'",

View file

@ -27,6 +27,7 @@
"devices": [
"/dev/dri"
],
"enable_nvidia_gpu": true,
"nextcloud_exec_commands": [
"php /var/www/html/occ app:install memories",
"php /var/www/html/occ app:enable memories",

View file

@ -2,7 +2,7 @@
This container bundles the hardware-transcoding container of memories and auto-configures it for you.
### Notes
- In order to actually enable the hardware transcoding, you need to add the following flag to AIO apart from adding this container: https://github.com/nextcloud/all-in-one#how-to-enable-hardware-transcoding-for-nextcloud
- In order to actually enable the hardware transcoding, you need to add the following flag to AIO apart from adding this container: https://github.com/nextcloud/all-in-one#how-to-enable-hardware-acceleration-for-nextcloud
- See https://github.com/nextcloud/all-in-one/tree/main/community-containers#community-containers how to add it to the AIO stack
### Repository

View file

@ -33,6 +33,7 @@
"devices": [
"/dev/dri"
],
"enable_nvidia_gpu": true,
"backup_volumes": [
"nextcloud_aio_plex"
]

View file

@ -29,7 +29,8 @@ services:
# NEXTCLOUD_STARTUP_APPS: deck twofactor_totp tasks calendar contacts notes # Allows to modify the Nextcloud apps that are installed on starting AIO the first time. See https://github.com/nextcloud/all-in-one#how-to-change-the-nextcloud-apps-that-are-installed-on-the-first-startup
# NEXTCLOUD_ADDITIONAL_APKS: imagemagick # This allows to add additional packages to the Nextcloud container permanently. Default is imagemagick but can be overwritten by modifying this value. See https://github.com/nextcloud/all-in-one#how-to-add-os-packages-permanently-to-the-nextcloud-container
# NEXTCLOUD_ADDITIONAL_PHP_EXTENSIONS: imagick # This allows to add additional php extensions to the Nextcloud container permanently. Default is imagick but can be overwritten by modifying this value. See https://github.com/nextcloud/all-in-one#how-to-add-php-extensions-permanently-to-the-nextcloud-container
# NEXTCLOUD_ENABLE_DRI_DEVICE: true # This allows to enable the /dev/dri device in the Nextcloud container. ⚠️⚠️⚠️ Warning: this only works if the '/dev/dri' device is present on the host! If it should not exist on your host, don't set this to true as otherwise the Nextcloud container will fail to start! See https://github.com/nextcloud/all-in-one#how-to-enable-hardware-transcoding-for-nextcloud
# NEXTCLOUD_ENABLE_DRI_DEVICE: true # This allows to enable the /dev/dri device in the Nextcloud container. ⚠️⚠️⚠️ Warning: this only works if the '/dev/dri' device is present on the host! If it should not exist on your host, don't set this to true as otherwise the Nextcloud container will fail to start! See https://github.com/nextcloud/all-in-one#how-to-enable-hardware-acceleration-for-nextcloud
# ENABLE_NVIDIA_GPU: true # This allows to enable the NVIDIA runtime and GPU access for containers that profit from it. ⚠️⚠️⚠️ Warning: this only works if an NVIDIA gpu is installed on the server. See https://github.com/nextcloud/all-in-one#how-to-enable-hardware-acceleration-for-nextcloud.
# NEXTCLOUD_KEEP_DISABLED_APPS: false # Setting this to true will keep Nextcloud apps that are disabled in the AIO interface and not uninstall them if they should be installed. See https://github.com/nextcloud/all-in-one#how-to-keep-disabled-apps
# SKIP_DOMAIN_VALIDATION: false # This should only be set to true if things are correctly configured. See https://github.com/nextcloud/all-in-one?tab=readme-ov-file#how-to-skip-the-domain-validation
# TALK_PORT: 3478 # This allows to adjust the port that the talk container is using. See https://github.com/nextcloud/all-in-one#how-to-adjust-the-talk-port

View file

@ -15,6 +15,7 @@ OUTPUT="$(cat /tmp/containers.json)"
OUTPUT="$(echo "$OUTPUT" | jq 'del(.services[].internal_port)')"
OUTPUT="$(echo "$OUTPUT" | jq 'del(.services[].secrets)')"
OUTPUT="$(echo "$OUTPUT" | jq 'del(.services[].devices)')"
OUTPUT="$(echo "$OUTPUT" | jq 'del(.services[].enable_nvidia_gpu)')"
OUTPUT="$(echo "$OUTPUT" | jq 'del(.services[].backup_volumes)')"
OUTPUT="$(echo "$OUTPUT" | jq 'del(.services[].nextcloud_exec_commands)')"
OUTPUT="$(echo "$OUTPUT" | jq 'del(.services[].image_tag)')"

View file

@ -160,6 +160,9 @@
"pattern": "^/dev/[a-z]+$"
}
},
"enable_nvidia_gpu": {
"type": "boolean"
},
"apparmor_unconfined": {
"type": "boolean"
},

View file

@ -263,6 +263,7 @@
"devices": [
"/dev/dri"
],
"enable_nvidia_gpu": true,
"backup_volumes": [
"nextcloud_aio_nextcloud"
],
@ -520,6 +521,10 @@
"profiles": [
"talk-recording"
],
"devices": [
"/dev/dri"
],
"enable_nvidia_gpu": true,
"networks": [
"nextcloud-aio"
],

View file

@ -125,6 +125,7 @@ $app->get('/containers', function (Request $request, Response $response, array $
'nextcloud_max_time' => $configurationManager->GetNextcloudMaxTime(),
'nextcloud_memory_limit' => $configurationManager->GetNextcloudMemoryLimit(),
'is_dri_device_enabled' => $configurationManager->isDriDeviceEnabled(),
'is_nvidia_gpu_enabled' => $configurationManager->isNvidiaGpuEnabled(),
'is_talk_recording_enabled' => $configurationManager->isTalkRecordingEnabled(),
'is_docker_socket_proxy_enabled' => $configurationManager->isDockerSocketProxyEnabled(),
'is_whiteboard_enabled' => $configurationManager->isWhiteboardEnabled(),

View file

@ -23,6 +23,7 @@ readonly class Container {
private array $secrets,
/** @var string[] */
private array $devices,
private bool $enable_nvidia_gpu,
/** @var string[] */
private array $capAdd,
private int $shmSize,
@ -92,6 +93,10 @@ readonly class Container {
return $this->devices;
}
public function isNvidiaGpuEnabled() : bool {
return $this->enable_nvidia_gpu;
}
public function GetCapAdds() : array {
return $this->capAdd;
}

View file

@ -249,6 +249,11 @@ readonly class ContainerDefinitionFetcher {
$devices = $entry['devices'];
}
$enableNvidiaGpu = false;
if (is_bool($entry['enable_nvidia_gpu'])) {
$enableNvidiaGpu = $entry['enable_nvidia_gpu'];
}
$capAdd = [];
if (isset($entry['cap_add'])) {
$capAdd = $entry['cap_add'];
@ -312,6 +317,7 @@ readonly class ContainerDefinitionFetcher {
$dependsOn,
$secrets,
$devices,
$enableNvidiaGpu,
$capAdd,
$shmSize,
$apparmorUnconfined,

View file

@ -983,6 +983,17 @@ class ConfigurationManager
}
}
private function GetEnabledNvidiaGpu() : string {
$envVariableName = 'ENABLE_NVIDIA_GPU';
$configName = 'enable_nvidia_gpu';
$defaultValue = '';
return $this->GetEnvironmentalVariableOrConfig($envVariableName, $configName, $defaultValue);
}
public function isNvidiaGpuEnabled() : bool {
return $this->GetEnabledNvidiaGpu() === 'true';
}
private function GetKeepDisabledApps() : string {
$envVariableName = 'NEXTCLOUD_KEEP_DISABLED_APPS';
$configName = 'nextcloud_keep_disabled_apps';

View file

@ -491,6 +491,17 @@ readonly class DockerActionManager {
$requestBody['HostConfig']['Devices'] = $devices;
}
if ($container->isNvidiaGpuEnabled() && $this->configurationManager->isNvidiaGpuEnabled()) {
$requestBody['HostConfig']['Runtime'] = 'nvidia';
$requestBody['HostConfig']['DeviceRequests'] = [
[
"Driver" => "nvidia",
"Count" => 1,
"Capabilities" => [["gpu"]],
]
];
}
$shmSize = $container->GetShmSize();
if ($shmSize > 0) {
$requestBody['HostConfig']['ShmSize'] = $shmSize;

View file

@ -29,12 +29,16 @@
<p>Nextcloud has a timeout of {{ nextcloud_max_time }} seconds configured (important for big file uploads). See the <a href="https://github.com/nextcloud/all-in-one#how-to-adjust-the-max-execution-time-for-nextcloud">NEXTCLOUD_MAX_TIME documentation</a> on how to change this.</p>
<p>
{% if is_dri_device_enabled == true %}
The /dev/dri device which is needed for hardware transcoding is getting attached to the Nextcloud container.
{% if is_dri_device_enabled == true and is_nvidia_gpu_enabled == true %}
Hardware acceleration is enabled with the /dev/dri device and the Nvidia runtime.
{% elseif is_dri_device_enabled == true %}
Hardware acceleration is enabled with the /dev/dri device.
{% elseif is_nvidia_gpu_enabled == true %}
Hardware acceleration is enabled with the Nvidia runtime.
{% else %}
The /dev/dri device which is needed for hardware transcoding is not attached to the Nextcloud container.
Hardware acceleration is not enabled. It's recommended to enable hardware transcoding for better performance.
{% endif %}
See the <a href="https://github.com/nextcloud/all-in-one#how-to-enable-hardware-transcoding-for-nextcloud">NEXTCLOUD_ENABLE_DRI_DEVICE documentation</a> on how to change this.</p>
See the <a href="https://github.com/nextcloud/all-in-one#how-to-enable-hardware-acceleration-for-nextcloud">hardware acceleration documentation</a> on how to change this.</p>
<p>For further documentation on AIO, refer to <strong><a href="https://github.com/nextcloud/all-in-one#nextcloud-all-in-one">this page</a></strong>. You can use the browser search [CTRL]+[F] to search through the documentation. Additional documentation can be found <strong><a href="https://github.com/nextcloud/all-in-one/discussions/categories/wiki">here</a></strong>.</p>
</details>

View file

@ -44,7 +44,7 @@ Included are:
- `ffmpeg`, `smbclient`, `libreoffice` and `nodejs` are included by default
- Possibility included to [permanently add additional OS packages into the Nextcloud container](https://github.com/nextcloud/all-in-one#how-to-change-the-nextcloud-apps-that-are-installed-on-the-first-startup) without having to build your own Docker image
- Possibility included to [permanently add additional PHP extensions into the Nextcloud container](https://github.com/nextcloud/all-in-one#how-to-add-php-extensions-permanently-to-the-nextcloud-container) without having to build your own Docker image
- Possibility included to [pass the needed device for hardware transcoding](https://github.com/nextcloud/all-in-one#how-to-enable-hardware-transcoding-for-nextcloud) to the Nextcloud container
- Possibility included to [pass the needed device for hardware transcoding](https://github.com/nextcloud/all-in-one#how-to-enable-hardware-acceleration-for-nextcloud) to the Nextcloud container
- Possibility included to [store all docker related files on a separate drive](https://github.com/nextcloud/all-in-one#how-to-store-the-filesinstallation-on-a-separate-drive)
- [Additional features can be added very easily](https://github.com/nextcloud/all-in-one/tree/main/community-containers#community-containers)
- [LDAP can be used as user backend for Nextcloud](https://github.com/nextcloud/all-in-one/tree/main#ldap)
@ -765,11 +765,33 @@ You can do so by adding `--env NEXTCLOUD_ADDITIONAL_PHP_EXTENSIONS="imagick exte
### What about the pdlib PHP extension for the facerecognition app?
The [facerecognition app](https://apps.nextcloud.com/apps/facerecognition) requires the pdlib PHP extension to be installed. Unfortunately, it is not available on PECL nor via PHP core, so there is no way to add this into AIO currently. However you can use [this community container](https://github.com/nextcloud/all-in-one/tree/main/community-containers/facerecognition) in order to run facerecognition.
### How to enable hardware-transcoding for Nextcloud?
> [!WARNING]
> This only works if the `/dev/dri` device is present on the host! If it does not exists on your host, don't proceed as otherwise the Nextcloud container will fail to start! If you are unsure about this, better do not proceed with the instructions below.
### How to enable hardware acceleration for Nextcloud?
Some container can use GPU acceleration to increase performance like [memories app](https://apps.nextcloud.com/apps/memories) allows to enable hardware transcoding for videos.
The [memories app](https://apps.nextcloud.com/apps/memories) allows to enable hardware transcoding for videos. In order to use that, you need to add `--env NEXTCLOUD_ENABLE_DRI_DEVICE=true` to the docker run command of the mastercontainer (but before the last line `nextcloud/all-in-one:latest`! If it was started already, you will need to stop the mastercontainer, remove it (no data will be lost) and recreate it using the docker run command that you initially used) which will mount the `/dev/dri` device into the container. There is now a community container which allows to easily add the transcoding container of Memories to AIO: https://github.com/nextcloud/all-in-one/tree/main/community-containers/memories
#### With open source drivers MESA for AMD, Intel and **new** drivers `Nouveau` for Nvidia
> [!WARNING]
> This only works if the `/dev/dri` device is present on the host! If it does not exist on your host, don't proceed as otherwise the Nextcloud container will fail to start! If you are unsure about this, better do not proceed with the instructions below. Make sure that your driver is correctly configured on the host.
A list of supported device can be fond in [MESA 3D documentation](https://docs.mesa3d.org/systems.html).
This method use the [Direct Rendering Infrastructure](https://dri.freedesktop.org/wiki/) with the access to the `/dev/dri` device.
In order to use that, you need to add `--env NEXTCLOUD_ENABLE_DRI_DEVICE=true` to the docker run command of the mastercontainer (but before the last line `nextcloud/all-in-one:latest`! If it was started already, you will need to stop the mastercontainer, remove it (no data will be lost) and recreate it using the docker run command that you initially used) which will mount the `/dev/dri` device into the container.
#### With proprietary drivers for Nvidia :warning: BETA
> [!WARNING]
> This only works if the Nvidia Toolkit is installed on the host and an NVIDIA GPU is enabled! Make sure that it is correctly configured on the host. If it does not exist on your host, don't proceed as otherwise the Nextcloud container will fail to start! If you are unsure about this, better do not proceed with the instructions below.
>
> This feature is in beta. Since the proprietary, we haven't a lot of user using proprietary drivers, we can't guarantee the stability of this feature. Your feedback is welcome.
This method use the [Nvidia Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/index.html) with the nvidia runtime.
In order to use that, you need to add `--env ENABLE_NVIDIA_GPU=true` to the docker run command of the mastercontainer (but before the last line `nextcloud/all-in-one:latest`! If it was started already, you will need to stop the mastercontainer, remove it (no data will be lost) and recreate it using the docker run command that you initially used) which will enable the nvidia runtime.
If you're using WSL2 and want to use the NVIDIA runtime, please follow the instructions to [install the NVIDIA Container Toolkit meta-version in WSL](https://docs.nvidia.com/cuda/wsl-user-guide/index.html#cuda-support-for-wsl-2).
### How to keep disabled apps?
In certain situations you might want to keep Nextcloud apps that are disabled in the AIO interface and not uninstall them if they should be installed in Nextcloud. You can do so by adding `--env NEXTCLOUD_KEEP_DISABLED_APPS=true` to the docker run command of the mastercontainer (but before the last line `nextcloud/all-in-one:latest`! If it was started already, you will need to stop the mastercontainer, remove it (no data will be lost) and recreate it using the docker run command that you initially used).