Compare commits: `dev`...`7fea170597` (16 commits)
| SHA1 |
|---|
| 7fea170597 |
| 029ba3cc90 |
| c8dd6e9c4a |
| afb38a067b |
| 2540c04782 |
| 03e2449be3 |
| ad034a9a6f |
| 6db371d513 |
| ea0f56c202 |
| 95156de4ca |
| 20e701b754 |
| 615d3fafa0 |
| a000af5c8b |
| b565224fcd |
| f7dcad115d |
| 4e41518cf4 |
@@ -6,6 +6,9 @@ PORT=1337
|
|||||||
STRAPI_URL=
|
STRAPI_URL=
|
||||||
STRAPI_ADMIN_URL=/admin
|
STRAPI_ADMIN_URL=/admin
|
||||||
|
|
||||||
|
# Branding (shown in the Strapi admin interface; a rebuild is required after changing it)
|
||||||
|
STRAPI_ADMIN_SITE_NAME=OKI
|
||||||
|
|
||||||
APP_KEYS=
|
APP_KEYS=
|
||||||
API_TOKEN_SALT=
|
API_TOKEN_SALT=
|
||||||
ADMIN_JWT_SECRET=
|
ADMIN_JWT_SECRET=
|
||||||
|
|||||||
@@ -56,6 +56,26 @@ Si la variable JWT_SECRET n'est pas renseignée, elle est générée automatique
|
|||||||
yarn && yarn build && yarn dev
|
yarn && yarn build && yarn dev
|
||||||
```
|
```
|
||||||
|
|
||||||
|
## API tokens
|
||||||
|
|
||||||
|
Endpoints marked ⚙️ **Token requis** (token required) need a Strapi API token.
|
||||||
|
|
||||||
|
**Create the token**: Strapi admin → *Settings → API Tokens → Create new API Token*
|
||||||
|
|
||||||
|
| Endpoint | Token type | Required permissions |
|
||||||
|
|----------|---------------|----------------------|
|
||||||
|
| `GET /paroles/export` | Custom | Parole → `export` |
|
||||||
|
| `POST /paroles/bulk-translate` | Custom | Parole → `bulkTranslate` |
|
||||||
|
|
||||||
|
For a single token covering both endpoints, create a token of type **Custom** and tick `export` and `bulkTranslate` in the *Parole* section.
|
||||||
|
|
||||||
|
Pass the token in the HTTP header:
|
||||||
|
```
|
||||||
|
Authorization: Bearer <token>
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Endpoints
|
## Endpoints
|
||||||
|
|
||||||
### `/awtis`
|
### `/awtis`
|
||||||
@@ -77,6 +97,72 @@ ___
|
|||||||
### `/paroles/count`
|
### `/paroles/count`
|
||||||
- `GET`: Returns the number of texts
|
- `GET`: Returns the number of texts
|
||||||
|
|
||||||
|
### `/paroles/bulk-translate` ⚙️ Token requis
|
||||||
|
- `POST`: Automatically translates, via DeepL, every published parole that has a French source (`traductions.francais` or `langueSource: fr`) into the missing languages (EN, ES, DE, IT). Existing translations are not modified. Each added translation ends with a DeepL attribution note in the target language.
|
||||||
|
|
||||||
|
**Response:**
|
||||||
|
```json
|
||||||
|
{"translated": [{"slug": "<slug>", "langs": ["en", "es"]}], "skipped": ["<slug>"], "errors": []}
|
||||||
|
```
|
||||||
|
|
||||||
|
| Field | Description |
|
||||||
|
|-------|-------------|
|
||||||
|
| `translated` | Paroles that were updated, each with its `slug` and the list of added `langs` |
|
||||||
|
| `skipped` | Slugs of the paroles that were skipped (no French source, or already complete) |
|
||||||
|
| `errors` | DeepL errors, each with `slug`, `lang` and `error` |
|
||||||
|
|
||||||
|
**Example:**
|
||||||
|
```bash
|
||||||
|
curl -X POST -H "Authorization: Bearer <token>" \
|
||||||
|
"https://api.pawol.nu/api/paroles/bulk-translate"
|
||||||
|
```
|
||||||
|
|
||||||
|
___
|
||||||
|
|
||||||
|
### `/paroles/export` ⚙️ Token requis
|
||||||
|
- `GET`: Exports published paroles and their translations as JSONL or JSON, for training LLMs
|
||||||
|
|
||||||
|
**Query parameters:**
|
||||||
|
|
||||||
|
| Parameter | Accepted values | Default | Description |
|
||||||
|
|-----------|-------------------|--------|-------------|
|
||||||
|
| `type` | `pairs` \| `instruct` | `pairs` | Format of the training examples |
|
||||||
|
| `lang` | `fr,en,es,de,it` | all | Target languages to include (comma-separated; unknown codes are ignored, and the request is rejected if none is valid) |
|
||||||
|
| `format` | `jsonl` \| `json` | `jsonl` | Response format |
|
||||||
|
|
||||||
|
**Type `pairs`** (parallel source/target corpus, suited to translation models):
|
||||||
|
```json
|
||||||
|
{"source_lang":"ka","target_lang":"fr","source":"Mwen ka palé épi ou…","target":"Je suis en train de te parler…","title":"Titre","artists":["Artiste"]}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Type `instruct`** (instruction/chat format, suited to fine-tuning instruction models):
|
||||||
|
```json
|
||||||
|
{"messages":[{"role":"system","content":"Tu es un expert en langue KA…"},{"role":"user","content":"Tradui an fransé :\n\nMwen ka palé épi ou…"},{"role":"assistant","content":"Je suis en train de te parler…"}]}
|
||||||
|
```
|
||||||
|
|
||||||
|
**Format `jsonl`**: the first line holds the metadata (field `_metadata: true`). To filter it out:
|
||||||
|
```bash
|
||||||
|
jq 'select(._metadata | not)' export.jsonl
|
||||||
|
```
|
||||||
|
|
||||||
|
**Metadata included** (`_metadata: true` in JSONL, `metadata` key in JSON):
|
||||||
|
|
||||||
|
| Field | Description |
|
||||||
|
|-------|-------------|
|
||||||
|
| `exported_at` | Timestamp of the export |
|
||||||
|
| `total_paroles` | Number of paroles processed |
|
||||||
|
| `total_pairs` | Number of training examples generated |
|
||||||
|
| `languages` | Number of pairs per target language |
|
||||||
|
| `missing_translations` | Paroles with missing translations, listing the missing languages for each |
|
||||||
|
| `non_ka_transcriptions` | Paroles whose transcription is declared (`langueSource`) or suspected to be in a language other than KA (e.g. French) |
|
||||||
|
|
||||||
|
**Example:**
|
||||||
|
```bash
|
||||||
|
curl -H "Authorization: Bearer <token>" \
|
||||||
|
"https://api.pawol.nu/api/paroles/export?type=instruct&lang=fr,en&format=jsonl" \
|
||||||
|
-o dataset.jsonl
|
||||||
|
```
|
||||||
|
|
||||||
## License
|
## License
|
||||||
|
|
||||||
Copyright (C) 2024 Cédric Famibelle-Pronzola & ORGANISATION KA INTERNATIONALE (OKI)
|
Copyright (C) 2024 Cédric Famibelle-Pronzola & ORGANISATION KA INTERNATIONALE (OKI)
|
||||||
|
|||||||
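The README hunk above says the JSONL export carries its metadata on a line marked `_metadata: true`. As a rough sketch of how a consumer might separate that line from the training examples in JavaScript (the helper name `splitExport` and the sample values are assumptions for illustration, not part of the API):

```javascript
// Sketch: separate the metadata line from the training examples in a JSONL export.
// `splitExport` is a hypothetical helper name, not something the API provides.
function splitExport(jsonlText) {
  let metadata = null
  const examples = []

  for (const line of jsonlText.split('\n')) {
    if (line.trim() === '') continue // tolerate blank lines or a trailing newline
    const record = JSON.parse(line)
    if (record._metadata === true) {
      // Drop the marker and keep the remaining metadata fields.
      const { _metadata, ...rest } = record
      metadata = rest
    } else {
      examples.push(record)
    }
  }

  return { metadata, examples }
}

// Sample input shaped like the documented `pairs` export (values are made up).
const sample = [
  JSON.stringify({ _metadata: true, total_paroles: 1, total_pairs: 1 }),
  JSON.stringify({ source_lang: 'ka', target_lang: 'fr', source: 'Mwen ka palé', target: 'Je parle' }),
].join('\n')

const { metadata, examples } = splitExport(sample)
console.log(metadata.total_pairs, examples.length) // prints: 1 1
```

Keying on the marker rather than on the line position keeps the parser working even if the metadata line were ever moved.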
+4
-4
@@ -5,13 +5,13 @@
|
|||||||
"version": "0.1.0",
|
"version": "0.1.0",
|
||||||
"license": "AGPL-3.0",
|
"license": "AGPL-3.0",
|
||||||
"author": {
|
"author": {
|
||||||
"name": "Cédric Famibelle-Pronzola",
|
"name": "ORGANISATION KA INTERNATIONALE",
|
||||||
"email": "contact@cedric-pronzola.dev",
|
"email": "kontak@o-k-i.net",
|
||||||
"url": "https://cedric-pronzola.dev"
|
"url": "https://o-k-i.net"
|
||||||
},
|
},
|
||||||
"repository": {
|
"repository": {
|
||||||
"type": "git",
|
"type": "git",
|
||||||
"url": "git+https://codeberg.org/OKI/api.pawol.nu.git"
|
"url": "git+https://labola.o-k-i.net/ORGANISATION-KA-INTERNATIONALE/api.pawol.nu"
|
||||||
},
|
},
|
||||||
"scripts": {
|
"scripts": {
|
||||||
"dev": "strapi develop",
|
"dev": "strapi develop",
|
||||||
|
|||||||
+4
-4
@@ -13,14 +13,14 @@ export default {
|
|||||||
locales: ['fr'],
|
locales: ['fr'],
|
||||||
translations: {
|
translations: {
|
||||||
fr: {
|
fr: {
|
||||||
'Auth.form.welcome.subtitle': 'Connectez-vous à votre compte OKI API',
|
'Auth.form.welcome.subtitle': `Connectez-vous à votre compte ${process.env.STRAPI_ADMIN_SITE_NAME || 'OKI'} API`,
|
||||||
'Auth.form.welcome.title': 'Bienvenue sur OKI API !',
|
'Auth.form.welcome.title': `Bienvenue sur ${process.env.STRAPI_ADMIN_SITE_NAME || 'OKI'} API !`,
|
||||||
'LeftMenu.navbrand.title': 'Tableau de bord',
|
'LeftMenu.navbrand.title': 'Tableau de bord',
|
||||||
'LeftMenu.navbrand.workplace': 'Menu',
|
'LeftMenu.navbrand.workplace': 'Menu',
|
||||||
},
|
},
|
||||||
en: {
|
en: {
|
||||||
'Auth.form.welcome.subtitle': 'Log in to your OKI API account',
|
'Auth.form.welcome.subtitle': `Log in to your ${process.env.STRAPI_ADMIN_SITE_NAME || 'OKI'} API account`,
|
||||||
'Auth.form.welcome.title': 'Welcome to OKI API !',
|
'Auth.form.welcome.title': `Welcome to ${process.env.STRAPI_ADMIN_SITE_NAME || 'OKI'} API !`,
|
||||||
'LeftMenu.navbrand.title': 'Dashboard',
|
'LeftMenu.navbrand.title': 'Dashboard',
|
||||||
'LeftMenu.navbrand.workplace': 'Workplace',
|
'LeftMenu.navbrand.workplace': 'Workplace',
|
||||||
}
|
}
|
||||||
|
|||||||
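The hunk above repeats `process.env.STRAPI_ADMIN_SITE_NAME || 'OKI'` in four strings. A minimal sketch of resolving the name once is shown below. The helper `welcomeStrings` and its `env` parameter are hypothetical, and in the real admin build the variable is read from `process.env` at build time, which is why `.env.example` notes that a rebuild is required.

```javascript
// Sketch: resolve the admin site name once, then reuse it in every string.
// `welcomeStrings` is a hypothetical helper; `env` stands in for process.env.
function welcomeStrings(env) {
  // An unset or empty variable falls back to 'OKI', as in the diff.
  const siteName = env.STRAPI_ADMIN_SITE_NAME || 'OKI'
  return {
    fr: {
      'Auth.form.welcome.subtitle': `Connectez-vous à votre compte ${siteName} API`,
      'Auth.form.welcome.title': `Bienvenue sur ${siteName} API !`,
    },
    en: {
      'Auth.form.welcome.subtitle': `Log in to your ${siteName} API account`,
      'Auth.form.welcome.title': `Welcome to ${siteName} API !`,
    },
  }
}

console.log(welcomeStrings({}).en['Auth.form.welcome.title'])
console.log(welcomeStrings({ STRAPI_ADMIN_SITE_NAME: 'Demo' }).fr['Auth.form.welcome.title'])
```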
@@ -57,6 +57,15 @@
|
|||||||
},
|
},
|
||||||
"musicBrainzUrl": {
|
"musicBrainzUrl": {
|
||||||
"type": "string"
|
"type": "string"
|
||||||
|
},
|
||||||
|
"isExclusiveArtist": {
|
||||||
|
"type": "boolean",
|
||||||
|
"default": false
|
||||||
|
},
|
||||||
|
"titrePhare": {
|
||||||
|
"type": "relation",
|
||||||
|
"relation": "manyToOne",
|
||||||
|
"target": "api::parole.parole"
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -144,6 +144,13 @@ module.exports = {
|
|||||||
let {data} = event.params
|
let {data} = event.params
|
||||||
const {documentId} = data
|
const {documentId} = data
|
||||||
|
|
||||||
|
if (data.isNewRelease === true) {
|
||||||
|
await strapi.db.query('api::parole.parole').updateMany({
|
||||||
|
where: { isNewRelease: true },
|
||||||
|
data: { isNewRelease: false },
|
||||||
|
})
|
||||||
|
}
|
||||||
|
|
||||||
const previousParoles = await strapi.db.query('api::parole.parole').findOne({
|
const previousParoles = await strapi.db.query('api::parole.parole').findOne({
|
||||||
where: {documentId},
|
where: {documentId},
|
||||||
populate: {difference: true, artistes: true}
|
populate: {difference: true, artistes: true}
|
||||||
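The `isNewRelease` block in the hunk above appears intended to keep at most one parole flagged as the new release: before an update that sets the flag, every currently flagged record is reset. The in-memory sketch below illustrates that rule without any Strapi calls. `applyUpdate` and the record shape are assumptions made for the example.

```javascript
// In-memory sketch of the "single new release" rule; no Strapi calls involved.
// `applyUpdate` and the record shape are hypothetical.
function applyUpdate(records, documentId, data) {
  if (data.isNewRelease === true) {
    // Mirrors the updateMany call: clear the flag everywhere first.
    for (const record of records) {
      if (record.isNewRelease) record.isNewRelease = false
    }
  }
  // Then apply the incoming data to the targeted record.
  const target = records.find(r => r.documentId === documentId)
  if (target) Object.assign(target, data)
  return records
}

const paroles = [
  { documentId: 'a', isNewRelease: true },
  { documentId: 'b', isNewRelease: false },
]

applyUpdate(paroles, 'b', { isNewRelease: true })
// Only 'b' remains flagged after the update.
console.log(paroles.filter(p => p.isNewRelease).map(p => p.documentId))
```

Updates that do not set `isNewRelease` to `true` leave the existing flag untouched, so clearing the flag on one record never promotes another.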
@@ -186,7 +193,7 @@ module.exports = {
|
|||||||
strapi.plugins['email'].services.email.send({
|
strapi.plugins['email'].services.email.send({
|
||||||
from: process.env.SMTP_FROM,
|
from: process.env.SMTP_FROM,
|
||||||
to: previousData.user.email,
|
to: previousData.user.email,
|
||||||
subject: `Publication de "${previousData.titre}" sur pawol.nu`,
|
subject: `Publication de "${previousData.titre}" sur ${(process.env.WEBSITE_URL || 'https://pawol.nu').replace(/^https?:\/\//, '')}`,
|
||||||
text: `Le titre que vous avez soumis, "${previousData.titre}" a été publié sur le site.
|
text: `Le titre que vous avez soumis, "${previousData.titre}" a été publié sur le site.
|
||||||
Vous pouvez le trouver à l'adresse ${process.env.WEBSITE_URL}/paroles/${previousData.slug}
|
Vous pouvez le trouver à l'adresse ${process.env.WEBSITE_URL}/paroles/${previousData.slug}
|
||||||
Merci pour votre contribution ❤️`,
|
Merci pour votre contribution ❤️`,
|
||||||
@@ -199,7 +206,7 @@ module.exports = {
|
|||||||
strapi.plugins['email'].services.email.send({
|
strapi.plugins['email'].services.email.send({
|
||||||
from: process.env.SMTP_FROM,
|
from: process.env.SMTP_FROM,
|
||||||
to: previousData.userAdmin.email,
|
to: previousData.userAdmin.email,
|
||||||
subject: `Publication de "${previousData.titre}" sur pawol.nu`,
|
subject: `Publication de "${previousData.titre}" sur ${(process.env.WEBSITE_URL || 'https://pawol.nu').replace(/^https?:\/\//, '')}`,
|
||||||
text: `Le titre que vous avez soumis, "${previousData.titre}" a été publié sur le site.
|
text: `Le titre que vous avez soumis, "${previousData.titre}" a été publié sur le site.
|
||||||
Vous pouvez le trouver à l'adresse ${process.env.WEBSITE_URL}/paroles/${previousData.slug}.
|
Vous pouvez le trouver à l'adresse ${process.env.WEBSITE_URL}/paroles/${previousData.slug}.
|
||||||
Merci pour votre contribution ❤️`,
|
Merci pour votre contribution ❤️`,
|
||||||
|
|||||||
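Both subject lines above derive a display label from `WEBSITE_URL` by stripping the scheme. The sketch below isolates that expression to show what it yields for a few inputs; the function name `siteLabel` is an assumption for illustration.

```javascript
// Sketch of the subject-line label: strip the scheme, fall back to pawol.nu.
// `siteLabel` is a hypothetical name for the inline expression in the diff.
function siteLabel(websiteUrl) {
  return (websiteUrl || 'https://pawol.nu').replace(/^https?:\/\//, '')
}

console.log(siteLabel(undefined))               // pawol.nu
console.log(siteLabel('http://localhost:3000')) // localhost:3000
console.log(siteLabel('https://pawol.nu/'))     // pawol.nu/ (a trailing slash is kept)
```

Note that the message body in the same hunk still interpolates `process.env.WEBSITE_URL` directly, so the fallback only protects the subject line. With the variable unset, the link in the body would start with `undefined`.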
@@ -65,18 +65,18 @@
|
|||||||
},
|
},
|
||||||
"traductions": {
|
"traductions": {
|
||||||
"type": "component",
|
"type": "component",
|
||||||
"repeatable": false,
|
"component": "trad.traductions",
|
||||||
"component": "trad.traductions"
|
"repeatable": false
|
||||||
},
|
},
|
||||||
"streamVideo": {
|
"streamVideo": {
|
||||||
"type": "component",
|
"type": "component",
|
||||||
"repeatable": true,
|
"component": "url.liens",
|
||||||
"component": "url.liens"
|
"repeatable": true
|
||||||
},
|
},
|
||||||
"streamAudio": {
|
"streamAudio": {
|
||||||
"type": "component",
|
"type": "component",
|
||||||
"repeatable": true,
|
"component": "store.album",
|
||||||
"component": "store.album"
|
"repeatable": true
|
||||||
},
|
},
|
||||||
"commentaires": {
|
"commentaires": {
|
||||||
"type": "relation",
|
"type": "relation",
|
||||||
@@ -88,8 +88,8 @@
|
|||||||
},
|
},
|
||||||
"difference": {
|
"difference": {
|
||||||
"type": "component",
|
"type": "component",
|
||||||
"repeatable": true,
|
"component": "difference.paroles-diff",
|
||||||
"component": "difference.paroles-diff"
|
"repeatable": true
|
||||||
},
|
},
|
||||||
"gadeEmbed": {
|
"gadeEmbed": {
|
||||||
"type": "string"
|
"type": "string"
|
||||||
@@ -118,6 +118,40 @@
|
|||||||
"videos",
|
"videos",
|
||||||
"files"
|
"files"
|
||||||
]
|
]
|
||||||
|
},
|
||||||
|
"pawol": {
|
||||||
|
"type": "media",
|
||||||
|
"multiple": false,
|
||||||
|
"allowedTypes": [
|
||||||
|
"files"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"langueSource": {
|
||||||
|
"type": "enumeration",
|
||||||
|
"enum": ["ka", "fr", "en", "es", "de", "it"],
|
||||||
|
"default": "ka"
|
||||||
|
},
|
||||||
|
"sourceOriginale": {
|
||||||
|
"type": "relation",
|
||||||
|
"relation": "manyToOne",
|
||||||
|
"target": "api::parole.parole",
|
||||||
|
"inversedBy": "remixes"
|
||||||
|
},
|
||||||
|
"remixes": {
|
||||||
|
"type": "relation",
|
||||||
|
"relation": "oneToMany",
|
||||||
|
"target": "api::parole.parole",
|
||||||
|
"mappedBy": "sourceOriginale"
|
||||||
|
},
|
||||||
|
"isNewRelease": {
|
||||||
|
"type": "boolean",
|
||||||
|
"default": false
|
||||||
|
},
|
||||||
|
"karaokeUrl": {
|
||||||
|
"type": "string"
|
||||||
|
},
|
||||||
|
"karaokeDesktopUrl": {
|
||||||
|
"type": "string"
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -2,7 +2,48 @@
|
|||||||
|
|
||||||
const { createCoreController } = require('@strapi/strapi').factories;
|
const { createCoreController } = require('@strapi/strapi').factories;
|
||||||
|
|
||||||
|
const VALID_LANGS = new Set(['fr', 'en', 'es', 'de', 'it'])
|
||||||
|
|
||||||
module.exports = createCoreController('api::parole.parole', ({strapi}) => ({
|
module.exports = createCoreController('api::parole.parole', ({strapi}) => ({
|
||||||
|
async export(ctx) {
|
||||||
|
const { type = 'pairs', lang, format = 'jsonl' } = ctx.query
|
||||||
|
|
||||||
|
const langs = lang
|
||||||
|
? lang.split(',').map(l => l.trim()).filter(l => VALID_LANGS.has(l))
|
||||||
|
: null
|
||||||
|
|
||||||
|
if (lang && (!langs || langs.length === 0)) {
|
||||||
|
return ctx.badRequest('Langue(s) invalide(s). Valeurs acceptées : fr, en, es, de, it.')
|
||||||
|
}
|
||||||
|
|
||||||
|
if (!['pairs', 'instruct'].includes(type)) {
|
||||||
|
return ctx.badRequest('type invalide. Valeurs acceptées : pairs, instruct.')
|
||||||
|
}
|
||||||
|
|
||||||
|
const paroles = await strapi.service('api::parole.parole').fetchAllParoles()
|
||||||
|
const { metadata, pairs } = strapi.service('api::parole.parole').buildExport(paroles, type, langs)
|
||||||
|
|
||||||
|
if (format === 'json') {
|
||||||
|
return ctx.send({ metadata, data: pairs })
|
||||||
|
}
|
||||||
|
|
||||||
|
// JSONL: the first line is the metadata, followed by the training examples.
|
||||||
|
// To filter out the metadata line: jq 'select(._metadata | not)'
|
||||||
|
const lines = [
|
||||||
|
JSON.stringify({ _metadata: true, ...metadata }),
|
||||||
|
...pairs.map(p => JSON.stringify(p)),
|
||||||
|
]
|
||||||
|
|
||||||
|
ctx.set('Content-Type', 'application/x-ndjson')
|
||||||
|
ctx.set('Content-Disposition', `attachment; filename="pawol-nu-export-${Date.now()}.jsonl"`)
|
||||||
|
ctx.body = lines.join('\n')
|
||||||
|
},
|
||||||
|
|
||||||
|
async bulkTranslate(ctx) {
|
||||||
|
const result = await strapi.service('api::parole.parole').bulkTranslateMissing()
|
||||||
|
return ctx.send(result)
|
||||||
|
},
|
||||||
|
|
||||||
async findOne(documentId) {
|
async findOne(documentId) {
|
||||||
const parole = await strapi.documents('api::parole.parole').findOne({
|
const parole = await strapi.documents('api::parole.parole').findOne({
|
||||||
documentId,
|
documentId,
|
||||||
|
|||||||
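The `export` action above filters the `lang` query parameter against a fixed set. The sketch below reproduces that parsing as a standalone function to make the edge cases visible. `parseLangs` and its return shape are assumptions for illustration, not the controller's actual structure.

```javascript
const VALID_LANGS = new Set(['fr', 'en', 'es', 'de', 'it'])

// Hypothetical helper mirroring how the controller handles ?lang=...
function parseLangs(lang) {
  // No parameter: no filter, every target language is exported.
  if (!lang) return { langs: null, error: null }

  const langs = lang.split(',').map(l => l.trim()).filter(l => VALID_LANGS.has(l))

  // The controller answers 400 only when nothing valid is left.
  if (langs.length === 0) return { langs: null, error: 'invalid' }
  return { langs, error: null }
}

console.log(parseLangs('fr, en')) // both codes kept, whitespace trimmed
console.log(parseLangs('fr,xx'))  // 'xx' is dropped silently, 'fr' is kept
console.log(parseLangs('FR'))     // rejected: the comparison is case-sensitive
```

One consequence worth knowing when calling the endpoint: a typo in one code of a list does not produce an error, it just yields a smaller export.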
@@ -0,0 +1,18 @@
|
|||||||
|
'use strict';
|
||||||
|
|
||||||
|
module.exports = {
|
||||||
|
routes: [
|
||||||
|
{
|
||||||
|
method: 'GET',
|
||||||
|
path: '/paroles/export',
|
||||||
|
handler: 'parole.export',
|
||||||
|
config: { policies: [], middlewares: [] },
|
||||||
|
},
|
||||||
|
{
|
||||||
|
method: 'POST',
|
||||||
|
path: '/paroles/bulk-translate',
|
||||||
|
handler: 'parole.bulkTranslate',
|
||||||
|
config: { policies: [], middlewares: [] },
|
||||||
|
},
|
||||||
|
],
|
||||||
|
};
|
||||||
@@ -6,6 +6,44 @@ const Diff = require('diff')
|
|||||||
const { createCoreService } = require('@strapi/strapi').factories;
|
const { createCoreService } = require('@strapi/strapi').factories;
|
||||||
const { ApplicationError } = require("@strapi/utils").errors
|
const { ApplicationError } = require("@strapi/utils").errors
|
||||||
|
|
||||||
|
const LANG_MAP = {
|
||||||
|
fr: { field: 'francais', targetLang: 'fr', userPrompt: 'Tradui an fransé' },
|
||||||
|
en: { field: 'anglais', targetLang: 'en', userPrompt: 'Translate to English' },
|
||||||
|
es: { field: 'espagnol', targetLang: 'es', userPrompt: 'Traduce al español' },
|
||||||
|
de: { field: 'allemand', targetLang: 'de', userPrompt: 'Übersetze auf Deutsch' },
|
||||||
|
it: { field: 'italien', targetLang: 'it', userPrompt: 'Traduci in italiano' },
|
||||||
|
}
|
||||||
|
|
||||||
|
const ALL_LANGS = Object.keys(LANG_MAP)
|
||||||
|
|
||||||
|
function stripMarkdown(text) {
|
||||||
|
if (!text) return ''
|
||||||
|
return text
|
||||||
|
.replace(/#{1,6}\s+/g, '')
|
||||||
|
.replace(/\*\*(.*?)\*\*/gs, '$1')
|
||||||
|
.replace(/\*(.*?)\*/gs, '$1')
|
||||||
|
.replace(/__(.*?)__/gs, '$1')
|
||||||
|
.replace(/_(.*?)_/gs, '$1')
|
||||||
|
.replace(/\[([^\]]+)\]\([^\)]+\)/g, '$1')
|
||||||
|
.replace(/^[>\-\*\+]\s+/gm, '')
|
||||||
|
.replace(/\n{3,}/g, '\n\n')
|
||||||
|
.trim()
|
||||||
|
}
|
||||||
|
|
||||||
|
// Detects whether a transcription is probably in French rather than KA.
|
||||||
|
// Heuristic: French personal pronouns make up more than 4% of the words.
|
||||||
|
const FR_PRONOUNS = new Set(['je', 'tu', 'il', 'elle', 'nous', 'vous', 'ils', 'elles'])
|
||||||
|
|
||||||
|
function suspectFrench(text) {
|
||||||
|
if (!text) return false
|
||||||
|
const words = text.toLowerCase().match(/[a-zàâäéèêëîïôöùûüç]+/g) || [] // no \b: it is ASCII-only and would split accented words
|
||||||
|
if (words.length < 10) return false
|
||||||
|
const frCount = words.filter(w => FR_PRONOUNS.has(w)).length
|
||||||
|
return frCount / words.length > 0.04
|
||||||
|
}
|
||||||
|
|
||||||
|
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))
|
||||||
|
|
||||||
class Translator {
|
class Translator {
|
||||||
constructor() {
|
constructor() {
|
||||||
this.deeplApi = process.env.DEEPL_URL || 'api-free.deepl.com'
|
this.deeplApi = process.env.DEEPL_URL || 'api-free.deepl.com'
|
||||||
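The pronoun heuristic in the hunk above depends on how the text is split into words. In JavaScript, `\b` only treats ASCII letters, digits and `_` as word characters, so a pattern anchored with `\b` cuts words at accented letters. The sketch below is a variation that tokenises on runs of letters instead. `tokenize` and `pronounRatio` are hypothetical names, not the service's functions.

```javascript
const FR_PRONOUNS = new Set(['je', 'tu', 'il', 'elle', 'nous', 'vous', 'ils', 'elles'])

// Variation on the heuristic: match maximal runs of letters, without \b.
function tokenize(text) {
  return text.toLowerCase().match(/[a-zàâäéèêëîïôöùûüç]+/g) || []
}

// Share of tokens that are French personal pronouns (0 for empty input).
function pronounRatio(text) {
  const words = tokenize(text)
  if (words.length === 0) return 0
  return words.filter(w => FR_PRONOUNS.has(w)).length / words.length
}

console.log(tokenize('Mwen ka palé épi ou'))                // [ 'mwen', 'ka', 'palé', 'épi', 'ou' ]
console.log('palé épi'.match(/\b[a-zàâäéèêëîïôöùûüç]+\b/g)) // [ 'pal', 'pi' ]
console.log(pronounRatio('je suis ici et tu es là avec nous'))
```

The pronouns themselves contain no accents, so they are detected either way. The difference is in the total word count, which shifts the ratio compared against the 4% threshold.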
@@ -72,6 +110,165 @@ module.exports = createCoreService('api::parole.parole', ({strapi}) => ({
|
|||||||
throw new ApplicationError('La transcription doit contenir au moins 10 caractères.')
|
throw new ApplicationError('La transcription doit contenir au moins 10 caractères.')
|
||||||
}
|
}
|
||||||
},
|
},
|
||||||
|
async fetchAllParoles() {
|
||||||
|
const pageSize = 100
|
||||||
|
let start = 0
|
||||||
|
const all = []
|
||||||
|
|
||||||
|
while (true) {
|
||||||
|
const batch = await strapi.documents('api::parole.parole').findMany({
|
||||||
|
status: 'published',
|
||||||
|
populate: ['artistes', 'traductions'],
|
||||||
|
fields: ['documentId', 'titre', 'slug', 'transcription', 'annee', 'langueSource'],
|
||||||
|
limit: pageSize,
|
||||||
|
start,
|
||||||
|
})
|
||||||
|
all.push(...batch)
|
||||||
|
if (batch.length < pageSize) break
|
||||||
|
start += pageSize
|
||||||
|
}
|
||||||
|
|
||||||
|
return all
|
||||||
|
},
|
||||||
|
|
||||||
|
buildExport(paroles, type, langs) {
|
||||||
|
const targetLangs = langs && langs.length ? langs : ALL_LANGS
|
||||||
|
const pairs = []
|
||||||
|
const missing = []
|
||||||
|
const nonKa = []
|
||||||
|
const langCounts = {}
|
||||||
|
|
||||||
|
for (const parole of paroles) {
|
||||||
|
const source = stripMarkdown(parole.transcription)
|
||||||
|
const sourceLang = parole.langueSource || 'ka'
|
||||||
|
const artists = (parole.artistes || []).map(a => a.alias)
|
||||||
|
const paroleMeta = { title: parole.titre, artists }
|
||||||
|
|
||||||
|
if (sourceLang !== 'ka') {
|
||||||
|
nonKa.push({ documentId: parole.documentId, slug: parole.slug, ...paroleMeta, suspected_lang: sourceLang })
|
||||||
|
} else if (suspectFrench(source)) {
|
||||||
|
nonKa.push({ documentId: parole.documentId, slug: parole.slug, ...paroleMeta, suspected_lang: 'fr' })
|
||||||
|
}
|
||||||
|
|
||||||
|
const missingLangs = ALL_LANGS.filter(lang => !parole.traductions?.[LANG_MAP[lang].field])
|
||||||
|
if (missingLangs.length > 0) {
|
||||||
|
missing.push({ documentId: parole.documentId, slug: parole.slug, ...paroleMeta, missing: missingLangs })
|
||||||
|
}
|
||||||
|
|
||||||
|
for (const lang of targetLangs) {
|
||||||
|
const { field, targetLang, userPrompt } = LANG_MAP[lang]
|
||||||
|
if (lang === sourceLang) continue
|
||||||
|
const target = stripMarkdown(parole.traductions?.[field])
|
||||||
|
if (!target) continue
|
||||||
|
|
||||||
|
langCounts[lang] = (langCounts[lang] || 0) + 1
|
||||||
|
|
||||||
|
if (type === 'instruct') {
|
||||||
|
const systemPrompt = sourceLang === 'ka'
|
||||||
|
? 'Tu es un expert en langue KA (créole guadeloupéen/martiniquais). Traduis le texte KA suivant.'
|
||||||
|
: `Tu es un expert en traduction. Traduis le texte suivant (langue source : ${sourceLang}).`
|
||||||
|
pairs.push({
|
||||||
|
messages: [
|
||||||
|
{ role: 'system', content: systemPrompt },
|
||||||
|
{ role: 'user', content: `${userPrompt} :\n\n${source}` },
|
||||||
|
{ role: 'assistant', content: target },
|
||||||
|
],
|
||||||
|
})
|
||||||
|
} else {
|
||||||
|
pairs.push({
|
||||||
|
source_lang: sourceLang,
|
||||||
|
target_lang: targetLang,
|
||||||
|
source,
|
||||||
|
target,
|
||||||
|
...paroleMeta,
|
||||||
|
})
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
const metadata = {
|
||||||
|
exported_at: new Date().toISOString(),
|
||||||
|
total_paroles: paroles.length,
|
||||||
|
total_pairs: pairs.length,
|
||||||
|
languages: langCounts,
|
||||||
|
missing_translations: missing,
|
||||||
|
non_ka_transcriptions: nonKa,
|
||||||
|
}
|
||||||
|
|
||||||
|
return { metadata, pairs }
|
||||||
|
},
|
||||||
|
|
||||||
|
async bulkTranslateMissing() {
|
||||||
|
const TARGET_LANGS = [
|
||||||
|
{ lang: 'en', field: 'anglais', deeplTarget: 'EN', suffix: '\n\n(Translated by DeepL)' },
|
||||||
|
{ lang: 'es', field: 'espagnol', deeplTarget: 'ES', suffix: '\n\n(Traducido por DeepL)' },
|
||||||
|
{ lang: 'de', field: 'allemand', deeplTarget: 'DE', suffix: '\n\n(Übersetzt von DeepL)' },
|
||||||
|
{ lang: 'it', field: 'italien', deeplTarget: 'IT', suffix: '\n\n(Tradotto da DeepL)' },
|
||||||
|
]
|
||||||
|
|
||||||
|
const pageSize = 100
|
||||||
|
let start = 0
|
||||||
|
const all = []
|
||||||
|
while (true) {
|
||||||
|
const batch = await strapi.documents('api::parole.parole').findMany({
|
||||||
|
status: 'published',
|
||||||
|
populate: ['traductions'],
|
||||||
|
fields: ['documentId', 'slug', 'titre', 'transcription', 'langueSource'],
|
||||||
|
limit: pageSize,
|
||||||
|
start,
|
||||||
|
})
|
||||||
|
all.push(...batch)
|
||||||
|
if (batch.length < pageSize) break
|
||||||
|
start += pageSize
|
||||||
|
}
|
||||||
|
|
||||||
|
const translator = new Translator()
|
||||||
|
const translated = []
|
||||||
|
const skipped = []
|
||||||
|
const errors = []
|
||||||
|
|
||||||
|
for (const parole of all) {
|
||||||
|
const sourceFR = parole.traductions?.francais
|
||||||
|
|| (parole.langueSource === 'fr' ? parole.transcription : null)
|
||||||
|
|
||||||
|
if (!sourceFR) { skipped.push(parole.slug); continue }
|
||||||
|
|
||||||
|
const missing = TARGET_LANGS.filter(({ field }) => !parole.traductions?.[field])
|
||||||
|
if (missing.length === 0) { skipped.push(parole.slug); continue }
|
||||||
|
|
||||||
|
const { id: _id, ...tradData } = parole.traductions || {}
|
||||||
|
const updatedTrad = { ...tradData }
|
||||||
|
const addedLangs = []
|
||||||
|
|
||||||
|
for (const { lang, field, deeplTarget, suffix } of missing) {
|
||||||
|
try {
|
||||||
|
await sleep(700)
|
||||||
|
const result = await translator.get('FR', deeplTarget, sourceFR)
|
||||||
|
const text = result?.translations?.[0]?.text
|
||||||
|
if (text) {
|
||||||
|
updatedTrad[field] = text + suffix
|
||||||
|
addedLangs.push(lang)
|
||||||
|
}
|
||||||
|
} catch (err) {
|
||||||
|
errors.push({ slug: parole.slug, lang: deeplTarget, error: err.message })
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if (addedLangs.length > 0) {
|
||||||
|
await strapi.documents('api::parole.parole').update({
|
||||||
|
documentId: parole.documentId,
|
||||||
|
data: { traductions: updatedTrad },
|
||||||
|
})
|
||||||
|
await strapi.documents('api::parole.parole').publish({
|
||||||
|
documentId: parole.documentId,
|
||||||
|
})
|
||||||
|
translated.push({ slug: parole.slug, langs: addedLangs })
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return { translated, skipped, errors }
|
||||||
|
},
|
||||||
|
|
||||||
parolesDiff(titre = '', oldString, newString) {
|
parolesDiff(titre = '', oldString, newString) {
|
||||||
const patch = Diff.createPatch(titre, oldString, newString, 'supprimée', 'ajoutée')
|
const patch = Diff.createPatch(titre, oldString, newString, 'supprimée', 'ajoutée')
|
||||||
const parsePatch = Diff.parsePatch(patch)
|
const parsePatch = Diff.parsePatch(patch)
|
||||||
|
|||||||
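`fetchAllParoles` and `bulkTranslateMissing` above both contain the same `start`/`limit` paging loop. One possible way to share it is sketched below. `fetchAll` and `findPage` are hypothetical names, with `findPage` standing in for a call to `strapi.documents(...).findMany(...)`, and the data source is a fake array so the sketch runs on its own.

```javascript
// Hypothetical helper: page through a collection until a short page is returned.
// `findPage({ start, limit })` stands in for strapi.documents(...).findMany(...).
async function fetchAll(findPage, pageSize = 100) {
  const all = []
  for (let start = 0; ; start += pageSize) {
    const batch = await findPage({ start, limit: pageSize })
    all.push(...batch)
    // A page shorter than pageSize means there is nothing left to read.
    if (batch.length < pageSize) break
  }
  return all
}

// Fake data source with 250 records to exercise the loop.
const rows = Array.from({ length: 250 }, (_, i) => ({ id: i }))
const fakeFindPage = async ({ start, limit }) => rows.slice(start, start + limit)

fetchAll(fakeFindPage).then(all => console.log(all.length)) // 250
```

As in the original loops, a total that is an exact multiple of the page size costs one extra request, which returns an empty page and ends the loop.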
Vendored
+16
@@ -448,6 +448,8 @@ export interface ApiArtisteArtiste extends Struct.CollectionTypeSchema {
|
|||||||
createdBy: Schema.Attribute.Relation<'oneToOne', 'admin::user'> &
|
createdBy: Schema.Attribute.Relation<'oneToOne', 'admin::user'> &
|
||||||
Schema.Attribute.Private;
|
Schema.Attribute.Private;
|
||||||
dateNaissance: Schema.Attribute.Date;
|
dateNaissance: Schema.Attribute.Date;
|
||||||
|
isExclusiveArtist: Schema.Attribute.Boolean &
|
||||||
|
Schema.Attribute.DefaultTo<false>;
|
||||||
locale: Schema.Attribute.String & Schema.Attribute.Private;
|
locale: Schema.Attribute.String & Schema.Attribute.Private;
|
||||||
localizations: Schema.Attribute.Relation<
|
localizations: Schema.Attribute.Relation<
|
||||||
'oneToMany',
|
'oneToMany',
|
||||||
@@ -461,6 +463,7 @@ export interface ApiArtisteArtiste extends Struct.CollectionTypeSchema {
|
|||||||
prenom: Schema.Attribute.String;
|
prenom: Schema.Attribute.String;
|
||||||
publishedAt: Schema.Attribute.DateTime;
|
publishedAt: Schema.Attribute.DateTime;
|
||||||
slug: Schema.Attribute.String;
|
slug: Schema.Attribute.String;
|
||||||
|
titrePhare: Schema.Attribute.Relation<'manyToOne', 'api::parole.parole'>;
|
||||||
updatedAt: Schema.Attribute.DateTime;
|
updatedAt: Schema.Attribute.DateTime;
|
||||||
updatedBy: Schema.Attribute.Relation<'oneToOne', 'admin::user'> &
|
updatedBy: Schema.Attribute.Relation<'oneToOne', 'admin::user'> &
|
||||||
Schema.Attribute.Private;
|
Schema.Attribute.Private;
|
||||||
@@ -540,6 +543,13 @@ export interface ApiParoleParole extends Struct.CollectionTypeSchema {
|
|||||||
>;
|
>;
|
||||||
forceSlug: Schema.Attribute.Boolean;
|
forceSlug: Schema.Attribute.Boolean;
|
||||||
gadeEmbed: Schema.Attribute.String;
|
gadeEmbed: Schema.Attribute.String;
|
||||||
|
isNewRelease: Schema.Attribute.Boolean & Schema.Attribute.DefaultTo<false>;
|
||||||
|
karaokeDesktopUrl: Schema.Attribute.String;
|
||||||
|
karaokeUrl: Schema.Attribute.String;
|
||||||
|
langueSource: Schema.Attribute.Enumeration<
|
||||||
|
['ka', 'fr', 'en', 'es', 'de', 'it']
|
||||||
|
> &
|
||||||
|
Schema.Attribute.DefaultTo<'ka'>;
|
||||||
locale: Schema.Attribute.String & Schema.Attribute.Private;
|
locale: Schema.Attribute.String & Schema.Attribute.Private;
|
||||||
localizations: Schema.Attribute.Relation<
|
localizations: Schema.Attribute.Relation<
|
||||||
'oneToMany',
|
'oneToMany',
|
||||||
@@ -548,9 +558,15 @@ export interface ApiParoleParole extends Struct.CollectionTypeSchema {
|
|||||||
Schema.Attribute.Private;
|
Schema.Attribute.Private;
|
||||||
musicBrainzUrl: Schema.Attribute.String;
|
musicBrainzUrl: Schema.Attribute.String;
|
||||||
okiMizikID: Schema.Attribute.Integer;
|
okiMizikID: Schema.Attribute.Integer;
|
||||||
|
pawol: Schema.Attribute.Media<'files'>;
|
||||||
prioriteArtistes: Schema.Attribute.String;
|
prioriteArtistes: Schema.Attribute.String;
|
||||||
publishedAt: Schema.Attribute.DateTime;
|
publishedAt: Schema.Attribute.DateTime;
|
||||||
|
remixes: Schema.Attribute.Relation<'oneToMany', 'api::parole.parole'>;
|
||||||
slug: Schema.Attribute.String & Schema.Attribute.Unique;
|
slug: Schema.Attribute.String & Schema.Attribute.Unique;
|
||||||
|
sourceOriginale: Schema.Attribute.Relation<
|
||||||
|
'manyToOne',
|
||||||
|
'api::parole.parole'
|
||||||
|
>;
|
||||||
streamAudio: Schema.Attribute.Component<'store.album', true>;
|
streamAudio: Schema.Attribute.Component<'store.album', true>;
|
||||||
streamVideo: Schema.Attribute.Component<'url.liens', true>;
|
streamVideo: Schema.Attribute.Component<'url.liens', true>;
|
||||||
titre: Schema.Attribute.String & Schema.Attribute.Required;
|
titre: Schema.Attribute.String & Schema.Attribute.Required;
|
||||||
|
|||||||