Re: [intel-vaapi-media] [Intel-gfx] [RFC v3] drm/i915: Select engines via class and instance in execbuffer2
by Tvrtko Ursulin
On 18/05/2017 15:10, Emil Velikov wrote:
> Hi Tvrtko,
>
> On 18 May 2017 at 14:06, Tvrtko Ursulin <tvrtko.ursulin(a)linux.intel.com> wrote:
>
>> struct drm_i915_engine_info {
>> /** Engine instance number. */
>> __u32 instance;
>> __u32 rsvd;
>>
>> /** Engine specific features and info. */
>> #define DRM_I915_ENGINE_HAS_HEVC BIT(0)
>> __u8 features;
>> __u8 rsvd;
>>
>> __32 info;
>> };
>>
> To spare yourself from writing a compat ioctl, you want to have the
> members at the same offset on both 32 and 64bit builds.
> At the same times the whole struct size should be multiple of 64bit.
>
> This and a few others are covered in Daniel Vetter's "Botching up ioctls" [1]
Yes, thanks, I failed to add up to 64. :)
Regards,
Tvrtko
5 years, 1 month
Re: [intel-vaapi-media] [Intel-gfx] [RFC v3] drm/i915: Select engines via class and instance in execbuffer2
by Tvrtko Ursulin
On 17/05/2017 17:44, Chris Wilson wrote:
> On Wed, May 17, 2017 at 04:40:57PM +0100, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
>>
>> Building on top of the previous patch which exported the concept
>> of engine classes and instances, we can also use this instead of
>> the current awkward engine selection uAPI.
>>
>> This is primarily interesting for the VCS engine selection which
>> is a) currently done via disjoint set of flags, and b) the
>> current I915_EXEC_BSD flags has different semantics depending on
>> the underlying hardware which is bad.
>>
>> Proposed idea here is to reserve 16-bits of flags, to pass in
>> the engine class and instance (8 bits each), and a new flag
>> named I915_EXEC_CLASS_INSTACE to tell the kernel this new engine
>> selection API is in use.
>>
>> The new uAPI also removes access to the weak VCS engine
>> balancing as currently existing in the driver.
>>
>> Example usage to send a command to VCS0:
>>
>> eb.flags = i915_execbuffer2_engine(DRM_I915_ENGINE_CLASS_VIDEO_DECODE, 0);
>>
>> Or to send a command to VCS1:
>>
>> eb.flags = i915_execbuffer2_engine(DRM_I915_ENGINE_CLASS_VIDEO_DECODE, 1);
>>
>> v2:
>> * Fix unknown flags mask.
>> * Use I915_EXEC_RING_MASK for class. (Chris Wilson)
>>
>> v3:
>> * Add a map for fast class-instance engine lookup. (Chris Wilson)
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
>> Cc: Ben Widawsky <ben(a)bwidawsk.net>
>> Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
>> Cc: Daniel Vetter <daniel.vetter(a)intel.com>
>> Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
>> Cc: Jon Bloomfield <jon.bloomfield(a)intel.com>
>> Cc: Daniel Charles <daniel.charles(a)intel.com>
>> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin(a)intel.com>
>> Cc: Oscar Mateo <oscar.mateo(a)intel.com>
>> Cc: "Gong, Zhipeng" <zhipeng.gong(a)intel.com>
>> Cc: intel-vaapi-media(a)lists.01.org
>> Cc: mesa-dev(a)lists.freedesktop.org
>> ---
>> drivers/gpu/drm/i915/i915_drv.h | 1 +
>> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 30 ++++++++++++++++++++++++++++++
>> drivers/gpu/drm/i915/i915_reg.h | 3 +++
>> drivers/gpu/drm/i915/intel_engine_cs.c | 7 +++++++
>> include/uapi/drm/i915_drm.h | 11 ++++++++++-
>> 5 files changed, 51 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 5dfa4a12e647..7bf4fd42480c 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2066,6 +2066,7 @@ struct drm_i915_private {
>> struct pci_dev *bridge_dev;
>> struct i915_gem_context *kernel_context;
>> struct intel_engine_cs *engine[I915_NUM_ENGINES];
>> + struct intel_engine_cs *engine_class[MAX_ENGINE_CLASS + 1][MAX_ENGINE_INSTANCE + 1];
>> struct i915_vma *semaphore;
>>
>> struct drm_dma_handle *status_page_dmah;
>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> index af1965774e7b..c1ad49ab64cd 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> @@ -1492,6 +1492,33 @@ gen8_dispatch_bsd_engine(struct drm_i915_private *dev_priv,
>> return file_priv->bsd_engine;
>> }
>>
>> +extern u8 user_class_map[DRM_I915_ENGINE_CLASS_MAX];
>> +
>> +static struct intel_engine_cs *get_engine_class(struct drm_i915_private *i915,
>> + u8 class, u8 instance)
>> +{
>> + if (class > MAX_ENGINE_CLASS || instance > MAX_ENGINE_INSTANCE)
>> + return NULL;
>> +
>> + return i915->engine_class[class][instance];
>> +}
>
> Be bold make this this an intel_engine_lookup(), I forsee some other
> users appearing very shortly.
Still static or you want to export it straight away? Preference for
where to put it? Here or intel_engine_cs.c?
>> +static struct intel_engine_cs *
>> +eb_select_engine_class_instance(struct drm_i915_private *i915,
>> + struct drm_i915_gem_execbuffer2 *args)
>> +{
>> + u8 class, instance;
>> +
>> + class = args->flags & I915_EXEC_RING_MASK;
>> + if (class >= DRM_I915_ENGINE_CLASS_MAX)
>> + return NULL;
>> +
>> + instance = (args->flags >> I915_EXEC_INSTANCE_SHIFT) &&
>> + I915_EXEC_INSTANCE_MASK;
>> +
>> + return get_engine_class(i915, user_class_map[class], instance);
>> +}
>> +
>> #define I915_USER_RINGS (4)
>>
>> static const enum intel_engine_id user_ring_map[I915_USER_RINGS + 1] = {
>> @@ -1510,6 +1537,9 @@ eb_select_engine(struct drm_i915_private *dev_priv,
>> unsigned int user_ring_id = args->flags & I915_EXEC_RING_MASK;
>> struct intel_engine_cs *engine;
>>
>> + if (args->flags & I915_EXEC_CLASS_INSTANCE)
>> + return eb_select_engine_class_instance(dev_priv, args);
>> +
>> if (user_ring_id > I915_USER_RINGS) {
>> DRM_DEBUG("execbuf with unknown ring: %u\n", user_ring_id);
>> return NULL;
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index ee144ec57935..a3b59043b991 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -92,6 +92,9 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
>> #define VIDEO_ENHANCEMENT_CLASS 2
>> #define COPY_ENGINE_CLASS 3
>> #define OTHER_CLASS 4
>> +#define MAX_ENGINE_CLASS 4
>> +
>> +#define MAX_ENGINE_INSTANCE 1
>
> We also need the names in the uapi.h as well. Do we want to mention
> OTHER_CLASS before it is defined? I trust your crystal ball.
I have mentioned it just by the virtue of it existing in i915_reg.h
Crystal ball is otherwise quiet.
> Amusingly I notice that you do claim that you have these names in the
> changelog. :) We don't need DRM_I915 all the time. I915_ENGINE_CLASS is
> going to be unique, at least for those users of i915_drm.h
They are all there courtesy of the previous patch.
I will drop the DRM_ prefix throughout.
>
>> /* PCI config space */
>>
>> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
>> index 7566cf48012f..c5ad51c43d23 100644
>> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
>> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
>> @@ -225,7 +225,14 @@ intel_engine_setup(struct drm_i915_private *dev_priv,
>>
>> ATOMIC_INIT_NOTIFIER_HEAD(&engine->context_status_notifier);
>>
>> + if (WARN_ON(info->class > MAX_ENGINE_CLASS ||
>> + info->instance > MAX_ENGINE_INSTANCE ||
>> + dev_priv->engine_class[info->class][info->instance]))
>
> Spread these out or add a message telling what was wrong.
Can do, but I considered these to be such a basic programming errors
which should be caught during development. Only thing which prevented me
from putting in under the GEM_BUG_ON is the fact someone could not have
it enabled and that this function can already handle failures.
>> + return -EINVAL;
>> +
>> + dev_priv->engine_class[info->class][info->instance] = engine;
>> dev_priv->engine[id] = engine;
>> +
>> return 0;
>> }
>>
>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>> index 2ac6667e57ea..6a26bdf5e684 100644
>> --- a/include/uapi/drm/i915_drm.h
>> +++ b/include/uapi/drm/i915_drm.h
>> @@ -906,7 +906,12 @@ struct drm_i915_gem_execbuffer2 {
>> */
>> #define I915_EXEC_FENCE_OUT (1<<17)
>>
>> -#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_OUT<<1))
>> +#define I915_EXEC_CLASS_INSTANCE (1<<18)
>> +
>> +#define I915_EXEC_INSTANCE_SHIFT (19)
>> +#define I915_EXEC_INSTANCE_MASK (0xff << I915_EXEC_INSTANCE_SHIFT)
>> +
>> +#define __I915_EXEC_UNKNOWN_FLAGS (-((1 << 27) << 1))
>>
>> #define I915_EXEC_CONTEXT_ID_MASK (0xffffffff)
>> #define i915_execbuffer2_set_context_id(eb2, context) \
>> @@ -914,6 +919,10 @@ struct drm_i915_gem_execbuffer2 {
>> #define i915_execbuffer2_get_context_id(eb2) \
>> ((eb2).rsvd1 & I915_EXEC_CONTEXT_ID_MASK)
>>
>> +#define i915_execbuffer2_engine(class, instance) \
>> + (I915_EXEC_CLASS_INSTANCE | (class) | \
>> + ((instance) << I915_EXEC_INSTANCE_SHIFT))
>
> (class |
> (instance) << I915_EXEC_INSTANCE_SHIFT |
> I915_EXEC_CLASS_INSTANCE)
>
> Just suggesting to spread it out over another line for better
> readability.
Okay but I'd like for the flag to come first, followed by class and then
instance. Okay with that?
Regards,
Tvrtko
5 years, 1 month
Re: [intel-vaapi-media] [Intel-gfx] [RFC v3] drm/i915: Select engines via class and instance in execbuffer2
by Tvrtko Ursulin
On 18/05/2017 13:24, Chris Wilson wrote:
> On Thu, May 18, 2017 at 01:13:20PM +0100, Tvrtko Ursulin wrote:
>>
>> On 18/05/2017 12:10, Chris Wilson wrote:
>>> On Thu, May 18, 2017 at 01:55:59PM +0300, Joonas Lahtinen wrote:
>>>> On ke, 2017-05-17 at 16:40 +0100, Tvrtko Ursulin wrote:
>>>>> From: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
>>>>>
>>>>> Building on top of the previous patch which exported the concept
>>>>> of engine classes and instances, we can also use this instead of
>>>>> the current awkward engine selection uAPI.
>>>>>
>>>>> This is primarily interesting for the VCS engine selection which
>>>>> is a) currently done via disjoint set of flags, and b) the
>>>>> current I915_EXEC_BSD flags has different semantics depending on
>>>>> the underlying hardware which is bad.
>>>>>
>>>>> Proposed idea here is to reserve 16-bits of flags, to pass in
>>>>> the engine class and instance (8 bits each), and a new flag
>>>>> named I915_EXEC_CLASS_INSTACE to tell the kernel this new engine
>>>>> selection API is in use.
>>>
>>> Note this text doesn't describe the interface in v3.
>>>
>>>> Would it make sense to use bitmask for future proofing? Then we could
>>>> allow passing multiple options in the future.
>>>
>>> No. The first use case has to be explicit control of engine. That's
>>> orthogonal to asking to select any of a particular class.
>>
>> Was the suggestion to allow instance=any and instance=N? Or even all
>> the way to instance=N-or-M?
>>
>> If not the latter, then I can think of two other possible approaches:
>>
>> 1. Reserve 0xff as instance=any, aka 128 instances should be enough
>> for everbody. :) Could simply enforce in the uAPI that instance ==
>> I915_EXEC_INSTANCE_MASK = -EINVAL for now or something.
>>
>> 2. Do nothing now and add say I915_EXEC_CLASS at a later point. This
>> patch adds I915_EXEC_CLASS_INSTANCE so I915_EXEC_CLASS would not
>> sound out of place.
>
> Yes, I would argue to defer it until later. One problem I have at the
> moment is that not all members of a class are equal, HEVC-capable
> engines and the reset do not belong to the same class (i.e. my hope is
> that we could just say <class> | [ <mask> | INSTANCE_MASK ] | LOAD_BALANCE)
> So I can see the sense in having instance as a mask, or at least making
> the current instance field large enough to store a mask in future. I
> just feel uneasy as that field could grow quite large, and maybe it will
> be better to set the constraint via a context param (all dependency on
> frequency and tuning of the LOAD_BALANCE). Hmm, liking having the
> instance-mask on the context atm.
I don't think per context mask would work unless you won't to mandate
multi-contexts where they wouldn't otherwise be needed.
But this problem in general can also be solved separately from
class-instance addressing via engine feature masking.
As the GEM_ENGINE_INFO ioctl proposal defines a set of engine flags, at
a later point execbuf could be extended with a new flag. For example:
eb.flags = I915_EXEC_CLASS | I915_ENGINE_CLASS_VIDEO |
I915_EXEC_ENGINE_FEATURES | I915_ENGINE_HAS_HEVC;
Only problem I see that engine flags in the current proposal are u64 so
it would be a bit challenging to fit that in eb.flags.
Not sure if it would make sense to split the engine info flags into a
smaller and larger set. u8 which would be flags "compatible" with
I915_EXEC_ENGINE_FEATURES, and the remainder which would be something
else, or unused/reserved? Like:
struct drm_i915_engine_info {
/** Engine instance number. */
__u32 instance;
__u32 rsvd;
/** Engine specific features and info. */
#define DRM_I915_ENGINE_HAS_HEVC BIT(0)
__u8 features;
__u8 rsvd;
__32 info;
};
Or at that point you bring on eb3.
Regards,
Tvrtko
5 years, 1 month
Re: [intel-vaapi-media] [Intel-gfx] [RFC v3] drm/i915: Select engines via class and instance in execbuffer2
by Tvrtko Ursulin
On 18/05/2017 12:10, Chris Wilson wrote:
> On Thu, May 18, 2017 at 01:55:59PM +0300, Joonas Lahtinen wrote:
>> On ke, 2017-05-17 at 16:40 +0100, Tvrtko Ursulin wrote:
>>> From: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
>>>
>>> Building on top of the previous patch which exported the concept
>>> of engine classes and instances, we can also use this instead of
>>> the current awkward engine selection uAPI.
>>>
>>> This is primarily interesting for the VCS engine selection which
>>> is a) currently done via disjoint set of flags, and b) the
>>> current I915_EXEC_BSD flags has different semantics depending on
>>> the underlying hardware which is bad.
>>>
>>> Proposed idea here is to reserve 16-bits of flags, to pass in
>>> the engine class and instance (8 bits each), and a new flag
>>> named I915_EXEC_CLASS_INSTACE to tell the kernel this new engine
>>> selection API is in use.
>
> Note this text doesn't describe the interface in v3.
>
>> Would it make sense to use bitmask for future proofing? Then we could
>> allow passing multiple options in the future.
>
> No. The first use case has to be explicit control of engine. That's
> orthogonal to asking to select any of a particular class.
Was the suggestion to allow instance=any and instance=N? Or even all the
way to instance=N-or-M?
If not the latter, then I can think of two other possible approaches:
1. Reserve 0xff as instance=any, aka 128 instances should be enough for
everbody. :) Could simply enforce in the uAPI that instance ==
I915_EXEC_INSTANCE_MASK = -EINVAL for now or something.
2. Do nothing now and add say I915_EXEC_CLASS at a later point. This
patch adds I915_EXEC_CLASS_INSTANCE so I915_EXEC_CLASS would not sound
out of place.
Regards,
Tvrtko
5 years, 1 month
Ubuntu/Linux recipe for Full HW acceleration HEVC decode on Skylake
by Ainapure, Rohit
Hi
I wanted to know if there is a recipe/app to utlilze the Full Hardware acceleration with 8-bit HEVC on Skylake using Ubuntu?
I know there is a recipe for CentOS, what I am really interested in is if there is one for Ubuntu, since we @Dolby have all our decoders & encoders running on Ubuntu.?
I will be able to build a custom kernel with the latest bits from 01.org to make sure va-api is the latest.
Q] Is there a branch of the Linux kernel which is self-sufficient at this stage to utilize the Full 8-bit HW acceleration on Skylake? Is the MediaSDK/MediaStudio still necessary, in terms of installing shared libraries / plugins etc.?
Q] Is there a way through the linux kernel that I can permanently select all the decode to be done using Full HW acceleration, assuming all input being 8-bit HEVC? This would enable me to use our own decoder and get some numbers on its performance using full HW acceleration.Would also like to measure how it does with Hybrid mode for 10-bit HEVC input. May be there is no need for it, may be the Linux kernel uses HW acceleration by default if the platform supports it?
Q] Was wondering if there is a sample_decode / sample_encode app for Ubuntu? Something similar to the one released with the MediaSDK/MSS.
We are using the Skylake NUC i7 for now
http://www.intel.com/content/www/us/en/nuc/nuc-kit-nuc6i7kyk-features-con...
Q] May not be the best place to ask this but wanted to know if the above SKU requires me to use the HW plugin while using the sample_decode from the Media SDK on CentOS -hw -p 33a61c0b4c27454ca8d85dde757c6f8e
BR,
Rohit
5 years, 1 month