vault backup: 2023-11-08 18:16:49

2023-11-08 18:16:50 +08:00
parent 7641b9750f
commit df1082fbf3
4 changed files with 24 additions and 5 deletions
--- a/02-Note/DAWA/AI偶像陪伴项目/AIVirtualIdel动画方案.md
+++ b/02-Note/DAWA/AI偶像陪伴项目/AIVirtualIdel动画方案.md
@@ -0,0 +1,408 @@
+# Animation Plan
+Prebuilt start/idle animation -> VMC-streamed animation -> prebuilt end/idle animation
+## VMC Streaming
+[[AnimNode & VMC笔记]]
+
+
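+## Iterating on the Animation State Machine
+1. The ChatGPT model assembles N permutations/combinations from previously recorded animation clips.
+2. Animation assets and the combination data are hot-updated periodically (automatically or manually).
+3. During a live stream, ChatGPT sends the name or ID of a specific combination to the client, and the client then plays the corresponding combined animation.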
+
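A minimal C++ sketch of step 3 above: the client keeps a table of combinations that hot updates refresh, and resolves the ID it receives into an ordered clip list. `AnimCombo`, `AnimComboRegistry`, and the clip names are hypothetical names for illustration only; nothing here is taken from the project's code.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical: one named combination is an ordered list of clip names.
struct AnimCombo {
    std::string name;
    std::vector<std::string> clips;
};

// Hypothetical registry: refreshed by hot updates, queried during a live stream.
class AnimComboRegistry {
public:
    // Add or replace a combination, e.g. after a hot update arrives.
    void Upsert(int id, AnimCombo combo) { combos_[id] = std::move(combo); }

    // Resolve the ID sent by the server. Returns nullptr when the client
    // does not have that combination (for example, not downloaded yet).
    const AnimCombo* Find(int id) const {
        auto it = combos_.find(id);
        return it == combos_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<int, AnimCombo> combos_;
};
```

Returning `nullptr` for an unknown ID lets the caller fall back to the prebuilt idle animation instead of failing when a hot update has not landed yet.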
+# Streaming Plan
+Videos on streaming:
+- https://www.bilibili.com/video/BV1ub4y1Y74K/?spm_id_from=333.337.search-card.all.click&vd_source=d47c0bb42f9c72fd7d74562185cee290
+- https://www.youtube.com/watch?v=ufU9me5pDYE&t=2s
+
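+# Protocols
+## OSC
+A **remote-control protocol**, commonly carried over UDP. Transmitted data falls into two kinds: Bundle and Message.
+- OSC: https://opensoundcontrol.stanford.edu/index.html
+- OSC implementation for Node.js: https://www.npmjs.com/package/osc
+	- Example code repository: https://github.com/colinbdclark/osc.js-examples 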
+
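+### Deserialization Steps
+1. Call `ReadOSC()`.
+2. Call `ReadOSCString` to read the Address. There are two main cases: `#bundle` and `#message`.
+3. `#bundle`
+	1. Read the uint64 Time.
+	2. Call `ReadOSC()` to recursively deserialize the data that follows.
+4. `#message`: deserialization logic for basic data
+	1. Read the FString Semantics; each character in it denotes the type of one of the basic data values that follow.
+	2. Deserialize each value according to its type; each value produces one FUEOSCElement.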
+
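The steps above can be sketched in plain C++ as follows. This is an illustration of the OSC 1.0 wire format (4-byte-aligned strings, big-endian numbers, a type-tag string such as `,if`), not UEOSC's actual code: `OscReader`, `OscArg`, `OscMessage`, `ReadMessage`, and `ReadPacket` are hypothetical names, only the `i`, `f`, and `s` tags are handled, and anything that is not a `#bundle` is treated as a message, since in standard OSC a message starts directly with its address.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Cursor over one OSC packet (hypothetical helper, assumes IEEE-754 floats).
class OscReader {
public:
    OscReader(const std::uint8_t* data, std::size_t size) : data_(data), size_(size) {}
    bool AtEnd() const { return pos_ >= size_; }

    // OSC strings are NUL-terminated and padded to a multiple of 4 bytes.
    std::string ReadString() {
        std::size_t end = pos_;
        while (end < size_ && data_[end] != 0) ++end;
        if (end >= size_) throw std::runtime_error("unterminated OSC string");
        std::string s(reinterpret_cast<const char*>(data_ + pos_), end - pos_);
        pos_ = (end + 4) & ~std::size_t(3);  // skip the NUL and the padding
        return s;
    }
    // OSC numbers are big-endian.
    std::uint32_t ReadUInt32() {
        const std::uint8_t* p = Skip(4);
        return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
               (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
    }
    float ReadFloat() {
        std::uint32_t bits = ReadUInt32();
        float f;
        std::memcpy(&f, &bits, sizeof f);
        return f;
    }
    // Advance by n bytes and return a pointer to the skipped region.
    const std::uint8_t* Skip(std::size_t n) {
        if (pos_ > size_ || n > size_ - pos_) throw std::runtime_error("truncated OSC packet");
        const std::uint8_t* p = data_ + pos_;
        pos_ += n;
        return p;
    }

private:
    const std::uint8_t* data_;
    std::size_t size_;
    std::size_t pos_ = 0;
};

struct OscArg { char tag; std::int32_t i; float f; std::string s; };
struct OscMessage { std::string address; std::vector<OscArg> args; };

// Message: address, type-tag string, then one value per tag character.
OscMessage ReadMessage(OscReader& r) {
    OscMessage m;
    m.address = r.ReadString();
    const std::string tags = r.ReadString();  // e.g. ",if"; index 0 is the comma
    for (std::size_t k = 1; k < tags.size(); ++k) {
        OscArg a{tags[k], 0, 0.0f, {}};
        switch (tags[k]) {
            case 'i': a.i = static_cast<std::int32_t>(r.ReadUInt32()); break;
            case 'f': a.f = r.ReadFloat(); break;
            case 's': a.s = r.ReadString(); break;
            default: throw std::runtime_error("unsupported type tag");
        }
        m.args.push_back(a);
    }
    return m;
}

// Bundle: "#bundle", a 64-bit time tag, then size-prefixed elements (recursive).
void ReadPacket(const std::uint8_t* data, std::size_t size, std::vector<OscMessage>& out) {
    OscReader r(data, size);
    if (size >= 8 && std::memcmp(data, "#bundle", 8) == 0) {
        r.Skip(8);  // "#bundle" plus its NUL terminator
        r.Skip(8);  // time tag, ignored in this sketch
        while (!r.AtEnd()) {
            std::uint32_t n = r.ReadUInt32();
            ReadPacket(r.Skip(n), n, out);
        }
    } else {
        out.push_back(ReadMessage(r));
    }
}
```

Collecting every message into one flat vector drops the bundle time tag; a receiver that needs it would carry it alongside each message.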
+```c++
+UENUM()  
+enum class EUEOSCElementType : uint8  
+{  
+	OET_Int32 UMETA(DisplayName = "Int32"),  
+	OET_Float UMETA(DisplayName = "Float"),  
+	OET_String UMETA(DisplayName = "String"),  
+	OET_Blob UMETA(DisplayName = "Blob"),  
+	OET_Bool UMETA(DisplayName = "Bool"),  
+	OET_Nil UMETA(DisplayName = "Nil")  
+};
+```
+
+### UEOSC Implementation Analysis
+A packet has the following format:
+```c++
+USTRUCT(BlueprintType)  
+struct UEOSC_API FUEOSCMessage  
+{  
+	GENERATED_USTRUCT_BODY()  
+	  
+	public:  
+	FString Address;  
+	TArray<FUEOSCElement> Elements;  
+};
+```
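+`FUEOSCElement` holds the actual payload: a struct containing the basic-type value together with an enum that records its type.
+
+#### Basic Data Types
+Data is serialized/deserialized as structs. The supported payload types are:
+- int32, int64, uint64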
+- float32
+- String(`FName`)
+- blob(`TArray<uint8>`)
+- bool
+
+## VMC
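+Full name: Virtual Motion Capture Protocol, an OSC-based protocol for transmitting virtual-avatar motion output. 
+
+**Known issues**:
+1. Data is not compressed, so it is poorly suited to transmission over the Internet.
+2. It runs on OSC over UDP, so delivery is not guaranteed.
+
+### Protocol Analysis
+- VMC protocol visualization tool: https://github.com/gpsnmeajp/VMCProtocolMonitor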
+- https://protocol.vmc.info/specification
+	- performer-spec:https://protocol.vmc.info/performer-spec
+	- marionette-spec:https://protocol.vmc.info/marionette-spec
+
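+The VMC protocol communicates essentially by one-way UDP using Open Sound Control (OSC).
+
+The VMC protocol also defines its own terminology:
+- **Marionette**
+	- Receives motion, renders, and so on. (Required)  
+	- It exists to ultimately produce a result on screen, in video, or over a communication channel.  
+    (Examples: EVMC4U, VMC4UE, other motion-receiving applications)
+- **Performer**
+	- Mainly processes motion and sends **full-body bone information (IK)** and **auxiliary information** to the Marionette. (Required)  
+	- (Examples: Virtual Motion Capture, Waidayo, VSeeFace, MocapForAll, TDPT, etc.)
+- **Assistant**
+	- Does not mainly process motion; sends auxiliary information to the Performer. (Optional)  
+	- Only responsible for sending **auxiliary information** (**some bones, tracker poses, facial expressions**, etc.)  
+    (Examples: Waidayo in face2vmc mode, Sknuckle, Simple Motion Tracker, Uni-studio, etc.)
+
+The communication rules are:
+- Use the appropriate OSC types when communicating.
+- Strings are UTF-8 encoded, so Japanese text can be sent.
+- For port numbers, the Marionette listens on port 39539 and the Performer listens on port 39540; from a UX standpoint, it is recommended that the send address and receive port be changeable.
+- Packets are bundled within an appropriate size (within 1500 bytes) and should be handled appropriately by the receiver.
+- The sender transmits at an interval of its own choosing. Not every message is sent in every cycle.  
+    The sender should also be able to adjust the interval, or send at a sufficiently low frequency.
+- The receiver should discard messages it does not need. It does not have to process every message.
+- Which messages are sent or received depends on both implementations.
+- Unknown addresses and surplus arguments should be ignored.
+- If there are too few arguments, or their types differ from those defined in the extension specification, treat them as an error or ignore them.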
+
+![800](https://protocol.vmc.info/flow.gif)
+![500](https://protocol.vmc.info/layer.png)
+
+### performer-spec
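+This is the specification for data sent in the `Marionette → Performer` or `Assistant → Performer` flows.
+
+Main items:
+- Virtual device transform
+- Frame period
+- Virtual MIDI CC value input
+- Virtual camera transform and FOV
+- VRM BlendShapeProxyValue
+- Eye-tracking target position
+- [Event] Request Information
+- [Event] Response string
+- [Event] Calibration / calibration-ready request
+- [Event] Request to load a settings file
+- Through information
+- DirectionalLight transform and color
+- [Event] Call Shortcut
+
+#### Virtual Device Transform
+Covers the virtual HMD, controllers (Controller), and trackers (Tracker). (The HMD is treated as a tracker.)
+Layout: virtual serial number -> Position -> Quaternion  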
+```json
+V2.3
+/VMC/Ext/Hmd/Pos (string){serial} (float){p.x} (float){p.y} (float){p.z} (float){q.x} (float){q.y} (float){q.z} (float){q.w}  
+/VMC/Ext/Con/Pos (string){serial} (float){p.x} (float){p.y} (float){p.z} (float){q.x} (float){q.y} (float){q.z} (float){q.w}  
+/VMC/Ext/Tra/Pos (string){serial} (float){p.x} (float){p.y} (float){p.z} (float){q.x} (float){q.y} (float){q.z} (float){q.w}  
+```
+
+#### Frame Period
+Sets the data transmission interval of Virtual Motion Capture. Data is sent at an interval of 1/x frames.
+```json
+V2.3
+/VMC/Ext/Set/Period (int){Status} (int){Root} (int){Bone} (int){BlendShape} (int){Camera} (int){Devices} 
+```
+
+### marionette-spec
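+This is the specification for data sent in the `Performer → Marionette` flow.
+- Status: basic state description (calibration state, calibration mode, tracking status)
+- Sender relative time (Time)
+- Root transform
+- Bone transform
+- VRM BlendShapeProxyValue
+- Camera transform and FOV
+- Controller input
+- [Event] Keyboard input
+- [Event] MIDI note input
+- [Event] MIDI CC value input
+- [Event] MIDI CC button input
+- Device transform
+- [Low frequency] Receive enable
+- [Low frequency] DirectionalLight transform and color
+- [Low frequency] Local VRM information
+- [Low frequency] Remote VRM basic information
+- [Low frequency] Option string
+- [Low frequency] Background color
+- [Low frequency] Window attribute information
+- [Low frequency] Loaded settings path
+- Through information
+
+### Root Transform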
+```json
+v2.0
+/VMC/Ext/Root/Pos (string){name} (float){p.x} (float){p.y} (float){p.z} (float){q.x} (float){q.y} (float){q.z} (float){q.w}  
+
+v2.1
+/VMC/Ext/Root/Pos (string){name} (float){p.x} (float){p.y} (float){p.z} (float){q.x} (float){q.y} (float){q.z} (float){q.w} (float){s.x} (float){s.y} (float){s.z} (float){o.x} (float){o.y} (float){o.z}  
+```
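+p = position, q = rotation (quaternion), s = scale for MR compositing, o = offset for MR compositing
+
+The absolute pose of the object that serves as the model root.  
+The name is fixed to "root".  
+It is recommended that the receiver treat the first half as the Position and the second half as the Quaternion of a Local pose (to match Bone).  
+
+Starting with v2.1, the scale and offset for MR compositing were added.  
+They can be used to adjust the avatar's position and size to match real body dimensions.
+
+### Bone Transform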
+```json
+/VMC/Ext/Bone/Pos (string){name} (float){p.x} (float){p.y} (float){p.z} (float){q.x} (float){q.y} (float){q.z} (float){q.w}  
+```
+
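+The local pose of each bone.  
+The name follows the type names of UnityEngine's HumanBodyBones.  
+The first half is the Position and the second half is the Quaternion.
+
+All HumanBodyBones are sent, including LastBone. This also transmits finger motion and the eye bones.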
+
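To make the `/VMC/Ext/Bone/Pos` layout above concrete, here is a minimal C++ sketch that encodes one bone as a raw OSC message with the type tag `,sfffffff` (one string followed by seven floats). It follows the OSC 1.0 encoding rules (NUL-padded strings, big-endian floats); `AppendOscString`, `AppendOscFloat`, `BonePose`, and `EncodeBonePos` are hypothetical helper names, not part of any VMC library.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Append an OSC string: the bytes, then 1 to 4 NULs so the total is a multiple of 4.
void AppendOscString(std::vector<std::uint8_t>& out, const std::string& s) {
    out.insert(out.end(), s.begin(), s.end());
    const std::size_t pad = 4 - (s.size() % 4);
    out.insert(out.end(), pad, std::uint8_t(0));
}

// Append a 32-bit float in big-endian byte order (assumes IEEE-754 floats).
void AppendOscFloat(std::vector<std::uint8_t>& out, float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<std::uint8_t>(bits >> shift));
}

// Hypothetical container for the seven floats: position, then quaternion.
struct BonePose {
    float px, py, pz;
    float qx, qy, qz, qw;
};

// Build one "/VMC/Ext/Bone/Pos" message: name, p.x p.y p.z, q.x q.y q.z q.w.
std::vector<std::uint8_t> EncodeBonePos(const std::string& boneName, const BonePose& p) {
    std::vector<std::uint8_t> out;
    AppendOscString(out, "/VMC/Ext/Bone/Pos");
    AppendOscString(out, ",sfffffff");
    AppendOscString(out, boneName);
    const float values[7] = {p.px, p.py, p.pz, p.qx, p.qy, p.qz, p.qw};
    for (float v : values) AppendOscFloat(out, v);
    return out;
}
```

With this encoding, a bone named `Hips` takes 68 bytes (20 for the address, 12 for the type tags, 8 for the name, 28 for the floats), so a sender has to split a full skeleton across several bundles to stay within the 1500-byte guideline.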
+## UE Remote Control
+https://docs.unrealengine.com/5.1/en-US/remote-control-for-unreal-engine/
+
+Based on WebSocket.
+
+# VMC4UE Implementation
+## AnimNode
+Implemented nodes:
+- FAnimNode_ModifyVMC4UEBones
+- FAnimNode_ModifyVMC4UEMorph
+
+![[AnimNode & VMC笔记]]
+
+# VMC App Code References
+- [VirtualMotionCaptureProtocol](https://github.com/sh-akira/VirtualMotionCaptureProtocol) provides the most basic implementation.
+- ~~[EasyVirtualMotionCaptureForUnity](https://github.com/gpsnmeajp/EasyVirtualMotionCaptureForUnity)~~
+- ThirdParts
+	- https://github.com/digital-standard/ThreeDPoseTracker
+		- The logic is in VMCPBonesSender.cs and uOscClientTDP.cs
+		- BufferSize = 8192;int MaxQueueSize = 100;
+
+## VirtualMotionCaptureProtocol
+Message approach:
+```c#
+void Update()
+{
+	//Load only when the model has been updated
+	if (Model != null && OldModel != Model)
+	{
+		animator = Model.GetComponent<Animator>();
+		blendShapeProxy = Model.GetComponent<VRMBlendShapeProxy>();
+		OldModel = Model;
+	}
+
+	if (Model != null && animator != null && uClient != null)
+	{
+		//Root
+		var RootTransform = Model.transform;
+		if (RootTransform != null)
+		{
+			uClient.Send("/VMC/Ext/Root/Pos",
+				"root",
+				RootTransform.position.x, RootTransform.position.y, RootTransform.position.z,
+				RootTransform.rotation.x, RootTransform.rotation.y, RootTransform.rotation.z, RootTransform.rotation.w);
+		}
+
+		//Bones
+		foreach (HumanBodyBones bone in Enum.GetValues(typeof(HumanBodyBones)))
+		{
+			if (bone != HumanBodyBones.LastBone)
+			{
+				var Transform = animator.GetBoneTransform(bone);
+				if (Transform != null)
+				{
+					uClient.Send("/VMC/Ext/Bone/Pos",
+						bone.ToString(),
+						Transform.localPosition.x, Transform.localPosition.y, Transform.localPosition.z,
+						Transform.localRotation.x, Transform.localRotation.y, Transform.localRotation.z, Transform.localRotation.w);
+				}
+			}
+		}
+
+		//Send bone positions as virtual trackers
+		SendBoneTransformForTracker(HumanBodyBones.Head, "Head");
+		SendBoneTransformForTracker(HumanBodyBones.Spine, "Spine");
+		SendBoneTransformForTracker(HumanBodyBones.LeftHand, "LeftHand");
+		SendBoneTransformForTracker(HumanBodyBones.RightHand, "RightHand");
+		SendBoneTransformForTracker(HumanBodyBones.LeftFoot, "LeftFoot");
+		SendBoneTransformForTracker(HumanBodyBones.RightFoot, "RightFoot");
+
+		//BlendShape
+		if (blendShapeProxy != null)
+		{
+			foreach (var b in blendShapeProxy.GetValues())
+			{
+				uClient.Send("/VMC/Ext/Blend/Val",
+					b.Key.ToString(),
+					(float)b.Value
+					);
+			}
+			uClient.Send("/VMC/Ext/Blend/Apply");
+		}
+
+		//Available
+		uClient.Send("/VMC/Ext/OK", 1);
+	}
+	else
+	{
+		uClient.Send("/VMC/Ext/OK", 0);
+	}
+	uClient.Send("/VMC/Ext/T", Time.time);
+
+	//Load request
+	uClient.Send("/VMC/Ext/VRM", filepath, "");
+}
+
+void SendBoneTransformForTracker(HumanBodyBones bone, string DeviceSerial)
+{
+	var DeviceTransform = animator.GetBoneTransform(bone);
+	if (DeviceTransform != null) {
+		uClient.Send("/VMC/Ext/Tra/Pos",
+	(string)DeviceSerial,
+	(float)DeviceTransform.position.x,
+	(float)DeviceTransform.position.y,
+	(float)DeviceTransform.position.z,
+	(float)DeviceTransform.rotation.x,
+	(float)DeviceTransform.rotation.y,
+	(float)DeviceTransform.rotation.z,
+	(float)DeviceTransform.rotation.w);
+	}
+}
+```
+
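+The Bundle approach packs the messages into a single Bundle; a timestamp is supplied when the Bundle is created. Messages are then added to it: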
+```c#
+foreach (HumanBodyBones bone in Enum.GetValues(typeof(HumanBodyBones)))
+{
+...
+ boneBundle.Add(new Message("/VMC/Ext/Bone/Pos",
+                            bone.ToString(),
+                            Transform.localPosition.x, Transform.localPosition.y, Transform.localPosition.z,
+                            Transform.localRotation.x, Transform.localRotation.y, Transform.localRotation.z, Transform.localRotation.w));
+...
+}
+```
+Bundle approach:
+```c#
+ void Update()
+    {
+        //Only model updated
+        if (Model != null && OldModel != Model)
+        {
+            animator = Model.GetComponent<Animator>();
+            blendShapeProxy = Model.GetComponent<VRMBlendShapeProxy>();
+            OldModel = Model;
+        }
+
+        if (Model != null && animator != null && uClient != null)
+        {
+            //Root
+            var RootTransform = Model.transform;
+            if (RootTransform != null)
+            {
+                uClient.Send("/VMC/Ext/Root/Pos",
+                    "root",
+                    RootTransform.position.x, RootTransform.position.y, RootTransform.position.z,
+                    RootTransform.rotation.x, RootTransform.rotation.y, RootTransform.rotation.z, RootTransform.rotation.w);
+            }
+
+            //Bones
+            var boneBundle = new Bundle(Timestamp.Now);
+            foreach (HumanBodyBones bone in Enum.GetValues(typeof(HumanBodyBones)))
+            {
+                if (bone != HumanBodyBones.LastBone)
+                {
+                    var Transform = animator.GetBoneTransform(bone);
+                    if (Transform != null)
+                    {
+                        boneBundle.Add(new Message("/VMC/Ext/Bone/Pos",
+                            bone.ToString(),
+                            Transform.localPosition.x, Transform.localPosition.y, Transform.localPosition.z,
+                            Transform.localRotation.x, Transform.localRotation.y, Transform.localRotation.z, Transform.localRotation.w));
+                    }
+                }
+            }
+            uClient.Send(boneBundle);
+
+            //Virtual Tracker send from bone position
+            var trackerBundle = new Bundle(Timestamp.Now);
+            SendBoneTransformForTracker(ref trackerBundle, HumanBodyBones.Head, "Head");
+            SendBoneTransformForTracker(ref trackerBundle, HumanBodyBones.Spine, "Spine");
+            SendBoneTransformForTracker(ref trackerBundle, HumanBodyBones.LeftHand, "LeftHand");
+            SendBoneTransformForTracker(ref trackerBundle, HumanBodyBones.RightHand, "RightHand");
+            SendBoneTransformForTracker(ref trackerBundle, HumanBodyBones.LeftFoot, "LeftFoot");
+            SendBoneTransformForTracker(ref trackerBundle, HumanBodyBones.RightFoot, "RightFoot");
+            uClient.Send(trackerBundle);
+
+            //BlendShape
+            if (blendShapeProxy != null)
+            {
+                var blendShapeBundle = new Bundle(Timestamp.Now);
+
+                foreach (var b in blendShapeProxy.GetValues())
+                {
+                    blendShapeBundle.Add(new Message("/VMC/Ext/Blend/Val",
+                        b.Key.ToString(),
+                        (float)b.Value
+                        ));
+                }
+                blendShapeBundle.Add(new Message("/VMC/Ext/Blend/Apply"));
+                uClient.Send(blendShapeBundle);
+            }
+
+            //Available
+            uClient.Send("/VMC/Ext/OK", 1);
+        }
+        else
+        {
+            uClient.Send("/VMC/Ext/OK", 0);
+        }
+        uClient.Send("/VMC/Ext/T", Time.time);
+
+        //Load request
+        uClient.Send("/VMC/Ext/VRM", vrmfilepath, "");
+
+    }
+
+```
--- a/02-Note/DAWA/AI偶像陪伴项目/AI偶像陪伴项目笔记.md
+++ b/02-Note/DAWA/AI偶像陪伴项目/AI偶像陪伴项目笔记.md
@@ -15,9 +15,18 @@
 	1. Text => voice => facial expression (lip sync)
 	2. Text => motion

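-# Design Approach
-1. All data sent by the server uses a UUID, generated from the dialogue the player sent, as its timestamp.
-2. Data is sent as a stream.
+# Streaming Design Approach
+1. All data sent by the server uses a UUID, generated from the dialogue the player sent, as its timestamp (a **user ID** may also be needed).
+2. Data is sent as a stream. Motion and lip sync are transmitted over VMC; audio is transmitted over RTMP. There are two streaming modes:
+	1. Stream mode that receives all data in real time.
+	2. Stream mode with buffered read-ahead.
+3. Architecture
+	1. For the two modes above, the server controls when the timestamp is reset (**it truncates the data and tells the client to truncate as well and realign its timeline**).
+		1.  Node.js in Puerts receives the data and runs a state machine that manages the received stream data and whether it is played. (**The core functions are written in C++**)
+	2. When AIVirtualIdolServer receives motion data, it forwards it to AIVirtualIdol.
+	3. For RTMP audio streaming, the AI side needs to deploy an RTMP pusher; AIVirtualIdolServer hosts the RTMP server; AIVirtualIdol plays the data after receiving it from the server.
+
+RTMP implementation in UE: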
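A minimal C++ sketch of the buffering idea in the list above, assuming each frame carries the UUID of the dialogue turn it belongs to. A pre-roll of 0 behaves like the real-time mode; a larger pre-roll gives buffered read-ahead; `Reset` stands in for the server-driven truncation. `StreamFrame`, `StreamBuffer`, and the pre-roll parameter are hypothetical and not part of the project.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Hypothetical frame: an opaque payload tagged with its dialogue-turn UUID.
struct StreamFrame {
    std::string sessionId;
    std::string payload;
};

class StreamBuffer {
public:
    // prerollFrames = 0 plays as soon as data arrives (real-time mode);
    // a larger value waits until that many frames are buffered (read-ahead).
    explicit StreamBuffer(std::size_t prerollFrames) : preroll_(prerollFrames) {}

    // The server announced a new turn: drop buffered frames and restart pre-roll.
    void Reset(const std::string& newSessionId) {
        session_ = newSessionId;
        frames_.clear();
        started_ = false;
    }

    // Frames from a stale turn are rejected. Returns whether the frame was kept.
    bool Push(StreamFrame frame) {
        if (frame.sessionId != session_) return false;
        frames_.push_back(std::move(frame));
        if (frames_.size() >= preroll_) started_ = true;
        return true;
    }

    // Hands out the next frame once playback has started; false otherwise.
    bool Pop(StreamFrame& out) {
        if (!started_ || frames_.empty()) return false;
        out = std::move(frames_.front());
        frames_.pop_front();
        return true;
    }

private:
    std::size_t preroll_;
    std::string session_;
    std::deque<StreamFrame> frames_;
    bool started_ = false;
};
```

Keying the buffer on the turn UUID means late packets from a truncated turn are dropped on arrival, so the client never has to reconcile two timelines.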

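 RTMP references
 - RTMP Protocol 01: Getting Started: https://www.jianshu.com/p/715f37b1202f
@@ -63,6 +72,14 @@ A Message is split into one or more Chunks, which are then sent over the network
 Source: Jianshu  
 Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, cite the source.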

+
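+# Development Plan
+1. Implement RecivedDataManagerComponent in C++ to manage audio, lip-sync & motion data, and text.
+2. Implement the data-receiving logic in Puerts.
+3. Implement the audio playback logic.
+4. Implement the lip-sync & motion animation nodes.
+5. Implement the first version of the animation state machine structure.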
+
 # Audio
 ## Runtime Import Sound
 Uses the `RuntimeAudioImporter` plugin, whose decoding relies on the third-party library https://github.com/mackron/dr_libs. Sound import uses `UImportedSoundWave::PopulateAudioDataFromDecodedInfo()`.
--- a/02-Note/DAWA/AI偶像陪伴项目/AI虚拟偶像陪伴项目开发计划与阶段目标.md
+++ b/02-Note/DAWA/AI偶像陪伴项目/AI虚拟偶像陪伴项目开发计划与阶段目标.md
@@ -0,0 +1,56 @@
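+## Stage Technical Requirements & Implementation Goals
+### Stage 0 (quickly build a basic program that AI can iterate on)
+1. Express HTTP server.
+	1. Provide a static file download service.
+	2. Set up a URL that returns all Uasset files.
+2. UE client
+	1. Build multiple sub-AnimGraphs in the Animation Blueprint and attach them dynamically.
+
+### Stage 1 (build a highly available, iterable base architecture)
+Required features (ordered by priority & difficulty):
+1. Hot-update logic for assets & logic scripts => Puerts hot-update logic.
+2. The client sends text messages to the server. => Build an HTTP chat server.
+3. The server controls the virtual character's behavior => ~~RPC event synchronization~~ indirect control, with the HTTP server connecting to Puerts over WebSocket. **This may later need to change to a frame-synchronization approach**.
+4. The client blends **newly downloaded** **animation assets** in the Animation Blueprint in real time.
+5. The client blends **live-streamed** **animation data** in the Animation Blueprint in real time, and plays **AI-generated speech**.
+
+#### Technical Details
+- Client:
+	- Use Puerts to control logic.
+	- Use Puerts to hot-update logic & assets.
+	- Download Pak files and batch-load them after a restart.
+	- Use the ModuleGameFeature framework for network caching. (Currently used by Fortnite)
+	- Use the Puerts host environment (Node.js) to build an HTTP service for communicating with the server. (The advantage is that it does not block the game thread.)
+		- This needs testing on iOS to confirm the approach works there.
+	- * Implement **animation data** streaming.
+- Server: client synchronization uses event synchronization + cached animation data
+	- Use Node.js as the foundation during the demo phase: Node.js + Express, with the admin page built in Vue 3.
+	- Serialization uses Protobuf
+	- RPC protocol: gRPC ?
+	- HTTP chat server.
+	- Account permission checks.
+	- File server for the Pak file / ModuleGameFeature caching approach.
+	- File upload (voice data).
+	- * Implement **animation data** streaming.
+
+Use another framework?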
+- https://github.com/node-pinus/pinus
+
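+#### Current Open Questions:
+I need to know:
+1. After the player sends a text message, will the virtual character respond with speech?
+2. How does the AI iterate on the virtual character's behavior tree? Does it only iterate on a single behavior within the tree?
+3. How do we iterate on a specific behavior of a specific performer? By recording a certain amount of that performer's motion animation data?
+
+### Stage 2 (use AI together with UE's animation system to iterate on animation quality)
+Required features (ordered by priority & difficulty):
+1. Build an animation framework that can be continuously hot-updated and optimized (an AI system iterating on animation assets).
+	1. MotionMarching
+	2. Motion Matching.
+2. Gameplay improvements.
+
+# Other Resources
+- Node-Pinus game server framework: https://github.com/node-pinus/pinus
+	- Example: https://github.com/node-pinus/pinus/tree/master/examples/simple-example
+	- Pomelo wiki: https://github.com/NetEase/pomelo/wiki/Home-in-Chinese
+- Node.js RPC: https://zhuanlan.zhihu.com/p/598460945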