98 Commits

Author SHA1 Message Date
AUTOMATIC1111
547778b10f possibly make NaN check cheaper 2024-06-08 12:41:38 +03:00
huchenlei
2a8a60c2c5 Add --precision half cmd option 2024-05-16 19:50:06 -04:00
Aarni Koskela
e3fa46f26f Fix various typos with crate-ci/typos 2024-03-04 08:42:07 +02:00
wangshuai09
cc3f604310 Update 2024-01-31 12:29:58 +08:00
wangshuai09
74ff85a1a1 Merge branch 'dev' into npu_support 2024-01-30 19:15:41 +08:00
Kohaku-Blueleaf
f9ba7e648a Revert "Try to reverse the dtype checking mechanism"
This reverts commit d243e24f53.
2024-01-29 22:54:12 +08:00
Kohaku-Blueleaf
d243e24f53 Try to reverse the dtype checking mechanism 2024-01-29 22:49:45 +08:00
Kohaku-Blueleaf
6e7f0860f7 linting 2024-01-29 22:46:43 +08:00
Kohaku-Blueleaf
750dd6014a Fix potential bugs 2024-01-29 22:27:53 +08:00
wangshuai09
ec124607f4 Add NPU Support 2024-01-29 19:25:06 +08:00
Kohaku-Blueleaf
4a66d2fb22 Avoid exceptions to be silenced 2024-01-20 16:33:59 +08:00
Kohaku-Blueleaf
81126027f5 Avoid early disable 2024-01-20 16:31:12 +08:00
Kohaku-Blueleaf
0181c1f76b Fix nested manual cast 2024-01-19 00:14:03 +08:00
Kohaku-Blueleaf
ca671e5d7b rearrange if-statements for cpu 2024-01-09 23:30:55 +08:00
Kohaku-Blueleaf
58d5b042cd Apply the correct behavior of precision='full' 2024-01-09 23:23:40 +08:00
Kohaku-Blueleaf
1fd69655fe Revert "Apply correct inference precision implementation"
This reverts commit e00365962b.
2024-01-09 23:15:05 +08:00
Kohaku-Blueleaf
e00365962b Apply correct inference precision implementation 2024-01-09 23:13:34 +08:00
Kohaku-Blueleaf
c2c05fcca8 linting and debugs 2024-01-09 22:53:58 +08:00
KohakuBlueleaf
42e6df723c Fix bugs when arg dtype doesn't match 2024-01-09 22:39:39 +08:00
Kohaku-Blueleaf
209c26a1cb improve efficiency and support more device 2024-01-09 22:11:44 +08:00
AUTOMATIC1111
a70dfb64a8 change import statements for #14478 2023-12-31 22:38:30 +03:00
Aarni Koskela
5768afc776 Add utility to inspect a model's parameters (to get dtype/device) 2023-12-31 13:22:43 +02:00
Kohaku-Blueleaf
9a15ae2a92 Merge branch 'dev' into test-fp8 2023-12-03 10:54:54 +08:00
AUTOMATIC1111
af5f0734c9 Merge pull request #14171 from Nuullll/ipex
Initial IPEX support for Intel Arc GPU
2023-12-02 19:22:32 +03:00
Kohaku-Blueleaf
110485d5bb Merge branch 'dev' into test-fp8 2023-12-02 17:00:09 +08:00
AUTOMATIC1111
88736b5557 Merge pull request #14131 from read-0nly/patch-1
Update devices.py - Make 'use-cpu all' actually apply to 'all'
2023-12-02 09:46:19 +03:00
Nuullll
7499148ad4 Disable ipex autocast due to its bad perf 2023-12-02 14:00:46 +08:00
Nuullll
8b40f475a3 Initial IPEX support 2023-11-30 20:22:46 +08:00
obsol
3cd6e1d0a0 Update devices.py
Fixes an issue where "--use-cpu all" correctly makes SD run on CPU but leaves ControlNet (and other extensions, I presume) pointed at the GPU, causing a crash in ControlNet due to a device mismatch between SD and CN

https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/14097
2023-11-27 19:21:43 -05:00
Kohaku-Blueleaf
043d2edcf6 Better naming 2023-11-19 15:56:31 +08:00
Kohaku-Blueleaf
598da5cd49 Use options instead of cmd_args 2023-11-19 15:50:06 +08:00
KohakuBlueleaf
ddc2a3499b Add MPS manual cast 2023-10-28 16:52:35 +08:00
Kohaku-Blueleaf
d4d3134f6d ManualCast for 10/16 series gpu 2023-10-28 15:24:26 +08:00
Kohaku-Blueleaf
eaa9f5162f Add CPU fp8 support
Since norm layers need fp32, I only convert the linear operation layers (conv2d/linear).

Also, TE uses some PyTorch functions that do not support bf16 AMP on CPU, so I added a condition to indicate whether the autocast is for the unet.
2023-10-24 01:49:05 +08:00
AUTOMATIC1111
46375f0592 fix for crash when running #12924 without --device-id 2023-09-09 09:39:37 +03:00
catboxanon
5681bf8016 More accurate check for enabling cuDNN benchmark on 16XX cards 2023-08-31 14:57:16 -04:00
AUTOMATIC1111
386245a264 split shared.py into multiple files; should resolve all circular reference import errors related to shared.py 2023-08-09 10:25:35 +03:00
AUTOMATIC1111
0d5dc9a6e7 rework RNG to use generators instead of generating noises beforehand 2023-08-09 08:43:31 +03:00
AUTOMATIC1111
fca42949a3 rework torchsde._brownian.brownian_interval replacement to use device.randn_local and respect the NV setting. 2023-08-03 07:18:55 +03:00
AUTOMATIC1111
84b6fcd02c add NV option for Random number generator source setting, which allows to generate same pictures on CPU/AMD/Mac as on NVidia videocards. 2023-08-03 00:00:23 +03:00
Aarni Koskela
b85fc7187d Fix MPS cache cleanup
Importing torch does not import torch.mps so the call failed.
2023-07-11 12:51:05 +03:00
AUTOMATIC1111
da8916f926 added torch.mps.empty_cache() to torch_gc()
changed a bunch of places that use torch.cuda.empty_cache() to use torch_gc() instead
2023-07-08 17:13:18 +03:00
Aarni Koskela
ba70a220e3 Remove a bunch of unused/vestigial code
As found by Vulture and some eyes
2023-06-05 22:43:57 +03:00
AUTOMATIC
8faac8b963 run basic torch calculation at startup in parallel to reduce the performance impact of first generation 2023-05-21 21:55:14 +03:00
AUTOMATIC
028d3f6425 ruff auto fixes 2023-05-10 11:05:02 +03:00
AUTOMATIC
5fe0dd79be rename CPU RNG to RNG source in settings, add infotext and parameters copypaste support to RNG source 2023-04-29 11:29:37 +03:00
Deciare
d40e44ade4 Option to use CPU for random number generation.
Makes a given manual seed generate the same images across different
platforms, independently of the GPU architecture in use.

Fixes #9613.
2023-04-18 23:27:46 -04:00
brkirch
1b8af15f13 Refactor Mac specific code to a separate file
Move most Mac related code to a separate file, don't even load it unless web UI is run under macOS.
2023-02-01 14:05:56 -05:00
brkirch
2217331cd1 Refactor MPS fixes to CondFunc 2023-02-01 06:36:22 -05:00
brkirch
7738c057ce MPS fix is still needed :(
Apparently I did not test with large enough images to trigger the bug with torch.narrow on MPS
2023-02-01 05:23:58 -05:00