Rollup merge of #77027 - termhn:mul_add_doc_change, r=m-ou-se

Improve documentation for `std::{f32,f64}::mul_add`

Makes it clearer that a performance improvement is not guaranteed when using FMA, even when the target architecture supports it natively.
Tyler Mandry 2020-12-10 21:32:59 -08:00 committed by GitHub
commit 1b4ffe4705
2 changed files with 8 additions and 4 deletions


@@ -206,8 +206,10 @@ impl f32 {
 /// Fused multiply-add. Computes `(self * a) + b` with only one rounding
 /// error, yielding a more accurate result than an unfused multiply-add.
 ///
-/// Using `mul_add` can be more performant than an unfused multiply-add if
-/// the target architecture has a dedicated `fma` CPU instruction.
+/// Using `mul_add` *may* be more performant than an unfused multiply-add if
+/// the target architecture has a dedicated `fma` CPU instruction. However,
+/// this is not always true, and will be heavily dependent on designing
+/// algorithms with specific target hardware in mind.
 ///
 /// # Examples
 ///


@@ -206,8 +206,10 @@ impl f64 {
 /// Fused multiply-add. Computes `(self * a) + b` with only one rounding
 /// error, yielding a more accurate result than an unfused multiply-add.
 ///
-/// Using `mul_add` can be more performant than an unfused multiply-add if
-/// the target architecture has a dedicated `fma` CPU instruction.
+/// Using `mul_add` *may* be more performant than an unfused multiply-add if
+/// the target architecture has a dedicated `fma` CPU instruction. However,
+/// this is not always true, and will be heavily dependent on designing
+/// algorithms with specific target hardware in mind.
 ///
 /// # Examples
 ///
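To make the accuracy point in the revised doc comment concrete, here is a minimal, stand-alone sketch (not part of the diff) contrasting `mul_add` with the unfused expression, written for `f64`; the same holds for `f32`. The input values are assumptions chosen so the difference in rounding is visible. Whether the fused form is actually faster is a separate question, depending on the target having a native `fma` instruction and on how the surrounding algorithm is structured.

fn main() {
    // Exactly representable inputs: fused and unfused agree.
    let m = 10.0_f64;
    let x = 4.0_f64;
    let b = 60.0_f64;
    assert_eq!(m.mul_add(x, b), m * x + b); // both are exactly 100.0

    // Inputs where the intermediate product is not exactly representable:
    // mathematically, (1 + eps) * (1 - eps) - 1 is -eps^2.
    let one_plus_eps = 1.0_f64 + f64::EPSILON;
    let one_minus_eps = 1.0_f64 - f64::EPSILON;
    let minus_one = -1.0_f64;

    // The fused form rounds only once, so the tiny -eps^2 term survives...
    assert_eq!(
        one_plus_eps.mul_add(one_minus_eps, minus_one),
        -f64::EPSILON * f64::EPSILON
    );
    // ...while the unfused form rounds the product to 1.0 and returns 0.0.
    assert_eq!(one_plus_eps * one_minus_eps + minus_one, 0.0);
}

The sketch only demonstrates the single-rounding behaviour; it says nothing about speed, which is exactly the distinction the wording change in this commit is meant to draw.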