1. Moving Average in Stable Diffusion (SMA & EMA)

References:
1. Moving average
2. 移动平均值
3. How We Trained Stable Diffusion for Less than $50k (Part 3)

Moving Average

In statistics, a moving average is a calculation used to analyze data points by creating a series of averages of different selections of the full data set. Given a sequence of numbers and a fixed subset size, the first element of the moving average is obtained by averaging the initial fixed subset of the sequence. The subset is then modified by "shifting forward": the first number of the series is excluded and the next value after the subset is included. (This explanation of the moving average is taken from reference 2, 移动平均值.)

1.1 Simple Moving Average (SMA): an unweighted MA

1.2 Exponential Moving Average (EMA): a weighted MA

In the context of Stable Diffusion, the Exponential Moving Average (EMA) is a technique used during the training of machine learning models, particularly neural networks, to stabilize and improve the model's performance. The Exponential Moving Average is a method of averaging that gives more weight to recent data points, making it more responsive to recent changes than a simple moving average, which weights all data points equally.

1.2.1 EMA in Stable Diffusion

In the context of Stable Diffusion, EMA is applied to the model parameters during training to create a smoothed version of the model. This is particularly useful in machine learning because the training process can be noisy, with the model parameters oscillating as they converge towards an optimal solution. Maintaining an EMA of the model parameters benefits training in the following ways:

- Smoothing: EMA smooths out the parameter updates, reducing the impact of noise and making the training process more stable.
- Better Generalization: The EMA version of the model often generalizes better on unseen data than the model with the raw parameters, because EMA favors parameter values that are consistent over time.
- Preventing Overfitting: By averaging the parameters over time, EMA can help mitigate overfitting, especially in cases where the model might otherwise converge too quickly to a suboptimal solution.

My personal understanding: the loss function is a function of the parameters (weights and biases), so each loss value corresponds to one set of parameter values; when the loss oscillates, the parameters oscillate too. When training Stable Diffusion, the MSE loss bounces up and down during gradient descent, and the model parameters bounce along with it. EMA takes an intermediate value of these oscillating parameters, and this averaged set of parameters better represents the parameters' average level across all time steps, giving the model better generalization.

Stable Diffusion 2 uses Exponential Moving Averaging (EMA), which maintains an exponential moving average of the weights. At every time step, the EMA model is updated by taking 0.9999 times the current EMA model plus 0.0001 times the new weights after the latest forward and backward pass:

$$\text{EMA}_t = 0.0001 \cdot x_t + 0.9999 \cdot \text{EMA}_{t-1}$$

By default, the EMA algorithm is applied after every gradient update for the entire training period. However, this can be slow due to the memory operations required: applying EMA to every parameter at every time step means the model's entire set of weights must be read and written at each step.

To reduce this cost, we can compute the EMA only over the final stretch of training. The key observation: since the old weights are decayed by a factor of 0.9999 at every batch, the early iterations of training contribute only minimally to the final average. This means we only need to take the exponential moving average of the final few steps. Concretely, we train for 1,400,000 batches and only apply EMA for the final 50,000 steps, which is about 3.5% of the training period. The weights from the first 1,350,000 iterations decay away by $(0.9999)^{50000}$, so their aggregate contribution would have a weight of less than 1% in the final model. Using this technique, we can avoid adding overhead for 96.5% of training and still achieve a nearly equivalent EMA model.
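As a quick sanity check on those two numbers (this snippet is my own addition, not part of the referenced MosaicML post), a few lines of Python confirm both claims:

```python
decay = 0.9999      # weight kept by the old EMA at each step
ema_steps = 50_000  # EMA is only applied over the final 50,000 steps
total_steps = 1_400_000

# Aggregate weight that all pre-window iterations retain in the final average:
print(decay ** ema_steps)             # ~0.0067, i.e. less than 1%

# Fraction of training that runs EMA-free:
print(1 - ema_steps / total_steps)    # ~0.964, i.e. roughly 96.5% of training
```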
1.2.2 Implementation in Stable Diffusion

During the training of a diffusion model, the EMA of the model's weights is updated alongside the regular updates. A typical process:

1. Initialize EMA Weights: At the start of training, initialize the EMA weights to be the same as the model's initial weights.
2. Update During Training: After each batch update, update the EMA weights using the formula above. This requires storing a separate copy of the weights for the EMA.
3. Use for Inference: At the end of training, use the EMA weights for inference instead of the raw model weights, because the EMA weights represent a more stable and potentially better-performing version of the model.

1.2.3 Practical Considerations

- Choosing $\alpha$: The smoothing factor $\alpha$ is a hyperparameter that needs to be chosen carefully. A common practice is to set $\alpha$ based on the number of iterations or epochs, such as $\alpha = \frac{2}{N+1}$, where $N$ is the number of iterations.
- Performance Overhead: Maintaining EMA weights requires additional memory and computation, but the benefits in model stability and performance usually outweigh these costs.

module.py

```python
class EMA:
    # Initializes the EMA object with a smoothing factor (beta) and a step counter (step).
    def __init__(self, beta):
        super().__init__()
        self.beta = beta  # smoothing factor for the exponential moving average
        self.step = 0     # step counter to keep track of the number of updates

    # Updates the moving average of the parameters of the EMA model (ma_model)
    # based on the current model (current_model).
    def update_model_average(self, ma_model, current_model):
        for current_params, ma_params in zip(current_model.parameters(), ma_model.parameters()):
            old_weight, up_weight = ma_params.data, current_params.data
            # Update the moving average of the parameters
            ma_params.data = self.update_average(old_weight, up_weight)

    # Computes the exponentially weighted average of the old and new parameters.
    def update_average(self, old, new):
        if old is None:
            return new
        return old * self.beta + (1 - self.beta) * new

    # Either resets the EMA model parameters to match the current model parameters
    # if the step count is less than step_start_ema, or updates the EMA model
    # parameters from the current model. Increments the step counter after each call.
    def step_ema(self, ema_model, model, step_start_ema=2000):
        if self.step < step_start_ema:
            self.reset_parameters(ema_model, model)
        else:
            self.update_model_average(ema_model, model)
        self.step += 1  # increment the step counter

    # Copies the current model's parameters to the EMA model to initialize them.
    def reset_parameters(self, ema_model, model):
        ema_model.load_state_dict(model.state_dict())
```
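To make the warm-up behaviour of `step_ema` concrete, here is a minimal, self-contained usage sketch. The toy `nn.Linear` model and the step counts are my own illustration (the real training script below uses a UNET); it assumes the `EMA` class above lives in module.py:

```python
import copy
import torch.nn as nn
from module import EMA  # the class defined above

model = nn.Linear(4, 4)  # toy stand-in for the real model
ema = EMA(beta=0.995)
ema_model = copy.deepcopy(model).eval().requires_grad_(False)

for _ in range(3000):
    # ... an optimizer step would update `model` here ...
    ema.step_ema(ema_model, model, step_start_ema=2000)
    # steps 0-1999:  ema_model is simply reset to a copy of model (warm-up)
    # steps >= 2000: ema_model tracks a 0.995-decay average of the weights
```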
train.py

```python
import copy
import logging
import os

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.tensorboard import SummaryWriter
from tqdm import tqdm


def train(args):
    device = args.device       # get the device to run the training on
    model = UNET().to(device)  # initialize the model and move it to the device
    model.train()
    optimizer = optim.AdamW(model.parameters(), lr=args.lr)  # set up the optimizer with AdamW
    mse = nn.MSELoss()  # Mean Squared Error loss function
    logger = SummaryWriter(os.path.join("runs", args.run_name))
    len_train = len(train_loader)

    # EMA: Exponential Moving Average
    ema = EMA(0.995)  # exponential moving average with decay rate 0.995
    # At the start of training, initialize the EMA weights to the model's initial weights.
    ema_model = copy.deepcopy(model).eval().requires_grad_(False)  # copy of the model, eval mode, no gradients

    print("Start into the loop!")
    for epoch in range(args.epochs):
        logging.info(f"Starting epoch {epoch}:")  # log the start of the epoch
        progress_bar = tqdm(train_loader)  # progress bar for the dataloader
        optimizer.zero_grad()  # explicitly zero the gradient buffers
        accumulation_steps = 4

        # Load all data into a batch
        for batch_idx, (images, captions) in enumerate(progress_bar):
            images = images.to(device)  # move images to the device
            # The dataloader adds a batch-size dimension to the tensor, but the VAE and
            # CLIP inputs already carry one, so squeeze the extra dimension and keep
            # only the dataloader's batch size.
            images = torch.squeeze(images, dim=1)
            captions = captions.to(device)  # move captions to the device
            text_embeddings = torch.squeeze(captions, dim=1)  # squeeze batch_size
            timesteps = ddpm_sampler.sample_timesteps(images.shape[0]).to(device)  # sample random timesteps
            noisy_latent_images, noises = ddpm_sampler.add_noise(images, timesteps)  # add noise to the images
            time_embeddings = timesteps_to_time_emb(timesteps)
            # x_t:     (batch_size, channel, Height/8, Width/8) = (bs, 4, 256/8, 256/8)
            # caption: (batch_size, seq_len, dim) = (bs, 77, 768)
            # t:       (batch_size, channel) = (batch_size, 1280)
            # Run the UNET forward pass with gradients enabled so backpropagation can
            # reach and update the UNET's parameters.
            last_decoder_noise = model(noisy_latent_images, text_embeddings, time_embeddings)  # (bs, 320, H/8, W/8)
            final_output = diffusion.final.to(device)
            predicted_noise = final_output(last_decoder_noise).to(device)  # (bs, 4, H/8, W/8)

            loss = mse(noises, predicted_noise)  # compute the loss
            loss.backward()  # backpropagate the loss
            if (batch_idx + 1) % accumulation_steps == 0:  # wait for several backward passes
                optimizer.step()       # now we can do an optimizer step
                optimizer.zero_grad()  # reset gradients to zero

            # EMA: update the exponential moving average of the weights
            ema.step_ema(ema_model, model)

            progress_bar.set_postfix(MSE=loss.item())  # update the progress bar with the loss
            # Log the loss to TensorBoard
            logger.add_scalar("MSE", loss.item(), global_step=epoch * len_train + batch_idx)

        # Save the model checkpoint
        os.makedirs(os.path.join("models", args.run_name), exist_ok=True)
        torch.save(model.state_dict(), os.path.join("models", args.run_name, "stable_diffusion.ckpt"))
        torch.save(optimizer.state_dict(), os.path.join("models", args.run_name, "optim.pt"))  # save the optimizer state
```
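One detail worth flagging: the script above checkpoints only the raw model, while step 3 of section 1.2.2 says inference should use the EMA weights. A minimal sketch of the extra save and load (the `stable_diffusion_ema.ckpt` file name is my own choice, not from the original script):

```python
# At checkpoint time, also save the smoothed EMA weights:
torch.save(ema_model.state_dict(),
           os.path.join("models", args.run_name, "stable_diffusion_ema.ckpt"))

# At inference time, load the EMA weights instead of the raw ones:
model = UNET().to(device)
model.load_state_dict(torch.load(
    os.path.join("models", args.run_name, "stable_diffusion_ema.ckpt")))
model.eval()
```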